ABSTRACT

We analyze the density field of galaxies observed by the Sloan Digital Sky Survey (SDSS)-III
Baryon Oscillation Spectroscopic Survey (BOSS) included in the SDSS Data Release Nine
(DR9). DR9 includes spectroscopic redshifts for over 400,000 galaxies spread over a footprint
of 3,275 deg$^2$. We identify, characterize, and mitigate the impact of sources of systematic
uncertainty on large-scale clustering measurements, both for angular moments of the
redshift-space correlation function, $\xi_\ell(s)$, and the spherically averaged power spectrum,
$P(k)$, in order to ensure that robust cosmological constraints will be obtained from these data.
A correlation between the projected density of stars and the higher redshift ($0.43 < z < 0.7$)
galaxy sample (the 'CMASS' sample) due to imaging systematics imparts a systematic error that
is larger than the statistical error of the clustering measurements at scales
$s > 120\,h^{-1}$ Mpc or $k < 0.01\,h$ Mpc$^{-1}$. We find that these errors can be ameliorated
by weighting galaxies based on their surface brightness and the local stellar density. The
clustering of CMASS galaxies found in the Northern and Southern Galactic footprints of the
survey generally agrees to within $2\sigma$. We use mock galaxy catalogs that simulate the CMASS
selection function to determine that randomly selecting galaxy redshifts in order to simulate
the radial selection function of a random sample imparts the least systematic error on
$\xi_\ell(s)$ measurements and that this systematic error is negligible for the spherically
averaged correlation function, $\xi_0$. We find a peak in $\xi_0$ at $s \sim 200\,h^{-1}$ Mpc,
with a corresponding feature with period $\sim 0.03\,h$ Mpc$^{-1}$ in $P(k)$, and find features
at least as strong in 4.8% of the mock galaxy catalogs, concluding this feature is likely to be
a consequence of cosmic variance. The methods we recommend for the calculation of clustering
measurements using the CMASS sample are adopted in companion papers that locate the position
of the baryon acoustic oscillation feature (Anderson et al. 2012), constrain cosmological
models using the full shape of $\xi_0$ (Sanchez et al. 2012), and measure the rate of structure
growth (Reid et al. 2012).

Key words: cosmology: observations, distance scale, large-scale structure 



1 INTRODUCTION 

In the last decade, wide-field surveys such as the Two Degree Field
Galaxy Redshift Survey (2dFGRS; Colless et al. 2003), the Sloan
Digital Sky Survey (SDSS; York et al. 2000), and the WiggleZ Redshift
Survey (Blake et al. 2010) have obtained accurate spectroscopic
redshifts of well over one million galaxies, allowing maps
of the 3-dimensional structure of the Universe to be constructed
out to $z = 0.9$. These maps encode a wealth of information on
cosmology (e.g., Tegmark et al. 2004; Cole et al. 2005; Eisenstein et
al. 2005; Percival et al. 2010; Reid et al. 2010; Blake et al. 2011;
Montesano et al. 2011) and the nature of galaxies (e.g., Norberg et
al. 2002; Gómez et al. 2003; Swanson et al. 2008; Wake et al. 2008;
Tojeiro & Percival 2010; Ross et al. 2011a; Zehavi et al. 2011).

The Baryon Oscillation Spectroscopic Survey (BOSS) is designed
to obtain spectroscopy of 1.5 million galaxies to $z = 0.7$
over an imaging area of 10,000 deg$^2$ (Eisenstein et al. 2011).
White et al. (2011) investigated an early sample from this survey,
confirming the survey was making a high-quality map of massive galaxies
with bias $\sim 2$. We utilize spectroscopic redshifts for over 400,000
BOSS galaxies that will be released as part of the SDSS Data Release
Nine (DR9). These galaxies cover close to 1/3 of the final
(planned) footprint, and currently comprise the largest effective
volume (Tegmark & Peebles 1998) of any spectroscopic galaxy
catalog: 2.2 Gpc$^3$ (assuming a concordance $\Lambda$CDM model). These
data should therefore allow the best-to-date statistical uncertainty
on the measured power spectrum, $P(k)$, and thus the best-to-date
cosmological measurements determined using a galaxy catalog. As
such, discovery and elimination of systematic uncertainty is of
vital importance to realizing the survey goals. Potential systematic
effects on the observed density of galaxies must be robustly tested
and ameliorated in an unbiased way.

The purpose of this study is to identify and minimize the impact
of sources of systematic uncertainty in the large-scale clustering
of BOSS galaxies, in order to ensure robust cosmological
constraints are obtained. Ross et al. (2011b) studied systematic effects
on the projected density of BOSS galaxy targets, finding a strong
relationship with stellar density and differences between the samples
occupying the Northern and Southern Galactic Caps (NGC and SGC
hereafter). We follow up and extend this work by testing how
these systematic variations affect spatial clustering measurements
and by testing for systematic effects incurred when obtaining
spectroscopic redshifts. We aim to answer the following questions:

(i) How do variations in photometric calibration, e.g., between 
the NGC and SGC footprints, affect the selection of BOSS galax- 
ies? 

(ii) How does the observed density of galaxies depend on ob- 
serving conditions? 

(iii) What is the best way to simulate the radial selection func- 
tion and how important are effects related to galaxy evolution? 

(iv) How do permutations of (i)-(iii) affect the clustering we 
measure? 

Our results have already been used in studies of the clustering
of BOSS DR9 galaxies. Anderson et al. (2012) localize the position
of the baryon acoustic oscillation (BAO) feature to better than
2% accuracy. Reid et al. (2012) measure redshift-space distortions
(RSD), and Samushia et al. (2012) thereby constrain dark
energy and modified gravity models. See also Tojeiro et al. (2012)
for a complementary method of measuring structure growth using
DR9 galaxies. Nuza et al. (2012) found that the clustering of BOSS
galaxies can be well approximated by using a sub-halo abundance
matching method applied to a dark matter simulation. Sanchez et
al. (2012) obtain cosmological constraints by fitting the full shape
of the correlation function. We hope that future studies heed, and
improve upon, our analysis, which we feel is the most careful
analysis of observational systematics to date.

The presentation of our analysis is organized as follows: In
Section 2 we describe the BOSS DR9 sample of galaxies and its
corresponding angular mask. In Section 3 we describe how we
estimate clustering statistics, their covariance, and compare to
models. For both the covariance and the models we utilize the mock
catalogs of galaxies (hereafter 'mocks') generated by Manera et
al. (2012). In Section 4 we investigate and explain the differences
we find in the densities of galaxies in the NGC and SGC,
addressing question (i). In Section 5 we describe potential sources
of systematic variation in the density of galaxies targeted for
spectroscopy and the methods we employ to remove these variations
in an unbiased way, addressing question (ii). In Section 6 we
investigate the radial distribution of our galaxy sample, using mock
catalogs to determine the least biased way to simulate the radial
selection function of BOSS galaxies and to check that the
clustering we measure is robust when the galaxies are split into two
samples above/below redshift 0.52, thus addressing question (iii).
Throughout Sections 4 through 6 we address question (iv) using
$\xi_\ell(s)$ measurements at $s < 150\,h^{-1}$ Mpc. In Section 7 we
consider the clustering at scales $s > 150\,h^{-1}$ Mpc, also utilizing
measurements of anisotropic clustering and the power spectrum.
We conclude in Section 8. Throughout, we assume a flat cosmology
with $\Omega_m = 0.274$, $\Omega_b h^2 = 0.0224$, $h = 0.70$, $n_s = 0.95$, and
$\sigma_8 = 0.8$ (identical to that used in White et al. 2011 and Anderson
et al. 2012) unless otherwise noted.

Figure 1. The footprint of BOSS DR9 galaxies, projected into two dimensions
using the McBryde-Thomas Flat Polar Quartic projection, is shaded in
blue and red. Areas with CMASS data only are shaded blue. The CMASS
and LOWZ footprints cover 3275 and 2208 deg$^2$, respectively. The grey
area represents the final (planned) BOSS footprint.



2 DATA 



The SDSS-III Baryon Oscillation Spectroscopic Survey (Eisenstein
et al. 2011) obtains targets using SDSS imaging data. In combination,
the SDSS-I, SDSS-II, and SDSS-III surveys obtained wide-field
CCD photometry (Gunn et al. 1998, 2006) in five passbands
(u, g, r, i, z; e.g., Fukugita et al. 1996), amassing a total footprint of
14,555 deg$^2$, internally calibrated using the 'uber-calibration'
process described in Padmanabhan et al. (2008), and with a 50%
completeness limit of point sources at $r = 22.5$ (Aihara et al. 2011).
After completing the imaging, BOSS has targeted 1.5 million massive
galaxies, 150,000 quasars, and over 75,000 ancillary targets for
spectroscopic observation over an area of 10,000 deg$^2$ (Eisenstein
et al. 2011). BOSS observations began in fall 2009, and the last
data will be acquired in 2014. The BOSS spectrographs (R = 1300-3000)
are fed by 1000 optical fibres in a single pointing, each with a
2" aperture. Each observation is performed in a series of 15-minute
exposures and integrated until a fiducial minimum signal-to-noise
ratio, chosen to ensure a high redshift success rate, is reached. This
ensures a sample with nearly isotropic redshift selection complete
to 98%. We test this isotropy in Section 2.3.



2.1 Target Selection

BOSS targets two samples of galaxies. These are the 'LOWZ' and
'CMASS' samples, as described by Eisenstein et al. (2011). We are
careful throughout this paper to distinguish between target objects
and true galaxies: the majority of LOWZ and CMASS targets are
galaxies, but a small fraction are stars (3% of CMASS) and
high-redshift quasars (1%, i.e., not objects sampling the intended
density field). Anything we refer to as a galaxy has been
spectroscopically confirmed as such. The SDSS measures magnitudes using
PSF-convolved fits to both de Vaucouleurs profiles (we denote these with
a dev subscript) and exponential profiles (we denote these with an
exp subscript). Both of these magnitudes are used to determine
'model' magnitudes, which we denote using the subscript mod, and 'cmodel'
magnitudes, denoted using the subscript cmod, which are used in
target selection. Model magnitudes denote the best-fit of the two
profiles in the r-band (see Stoughton et al. 2002 for further
information on model magnitudes). The cmodel magnitudes, first defined
in Abazajian et al. (2004), represent the best-fitting linear
combination of the exponential and de Vaucouleurs model fluxes. We will
also use PSF magnitudes, which we denote using the subscript psf.

We select BOSS targets using the photometry of objects
identified as galaxies by the SDSS pipeline. Most of the sample (100%
for LOWZ and 90.9% for CMASS) was targeted using the SDSS
DR8 photometry designated as 'primary'. The remaining sample
was targeted from images now designated in DR8 as secondary.
These data were superseded by overlapping imaging runs of better
quality but whose reductions were unavailable at the time of
targeting. Photometric scatter across the multiple selection boundaries
listed below implies that many objects targeted using primary
photometry would not have been targeted using secondary photometry
(and vice-versa). However, this result is simply due to the known
statistical distribution of measured magnitudes, quantified by the
magnitude error. This effect should not cause any additional
systematic error beyond that potentially induced by targeting from a
sample with magnitude errors that vary with angular position,
assuming one always uses the photometry used at the time of
targeting in an analysis. Indeed, we find restricting our analyses to data
targeted using DR8 primary photometry results in no significant
change in any clustering statistic we measure.

Eisenstein et al. (2011) define the selection criteria for BOSS
galaxy targets. We repeat them here for completeness and ease of
reference. The CMASS selection is defined by^1

$$17.5 < i_{\rm cmod} < 19.9 \qquad (1)$$
$$r_{\rm mod} - i_{\rm mod} < 2 \qquad (2)$$
$$d_\perp > 0.55 \qquad (3)$$
$$i_{\rm fib2} < 21.5 \qquad (4)$$
$$i_{\rm cmod} < 19.86 + 1.6(d_\perp - 0.8) \qquad (5)$$

where all magnitudes are corrected for Galactic extinction (via the
Schlegel, Finkbeiner & Davis 1998 dust maps), $i_{\rm fib2}$ is the i-band
magnitude within a 2" aperture, and

$$d_\perp = r_{\rm mod} - i_{\rm mod} - (g_{\rm mod} - r_{\rm mod})/8.0. \qquad (6)$$

These color cuts are designed to obtain a sample of galaxies with
approximately constant stellar mass with $z > 0.43$ and include
many galaxies that would be considered 'blue' by traditional SDSS
(rest-frame) color cuts (see, e.g., Strateva et al. 2001). Indeed,
Masters et al. (2011) find that 26% of CMASS galaxies have a late-type
(i.e., spiral disc) morphology. See Tojeiro et al. (2012) for a detailed
description of the CMASS population of galaxies.

^1 In the early part of the survey, various super-sets of this selection were
used, e.g., the fiber magnitude limit has changed from $i_{\rm fib2} < 21.7$ to
$i_{\rm fib2} < 21.5$. We only use data satisfying the above selection cuts in
our analysis and recommend the same for any cosmological analysis using
BOSS galaxy data, as this provides a more isotropic selection and discards
less than 3% of the total available DR9 CMASS redshifts.

Figure 2. The distribution of CMASS targets for a selection of mask sectors. Circles represent the sky area covered by observing tiles and the number of
overlapping tiles is indicated by the level of shading. Black dots indicate the positions of targets for which we obtained a 'good' redshift, as defined in Section
2.3. Blue squares denote targets for which we did not allocate a fiber for spectroscopic observation because the target is within 62" of another CMASS target
('close pair'). Green circles denote targets for which we did not allocate a fiber and that are not close pairs. Red triangles denote targets for which we allocated a fiber,
but did not obtain a good redshift.

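For CMASS targets, stars are further separated from galaxies
by only keeping objects with

$$i_{\rm psf} - i_{\rm mod} > 0.2 + 0.2(20.0 - i_{\rm mod}) \qquad (7)$$
$$z_{\rm psf} - z_{\rm mod} > 9.125 - 0.46\,z_{\rm mod} \qquad (8)$$

unless the object also passes the LOWZ cuts (only 0.5% of objects
passing the LOWZ selection cuts are stars), which are defined by

$$r_{\rm cmod} < 13.5 + c_\parallel/0.3 \qquad (9)$$
$$|c_\perp| < 0.2 \qquad (10)$$
$$16 < r_{\rm cmod} < 19.6 \qquad (11)$$
$$r_{\rm psf} - r_{\rm mod} > 0.3 \qquad (12)$$

where

$$c_\parallel = 0.7(g_{\rm mod} - r_{\rm mod}) + 1.2(r_{\rm mod} - i_{\rm mod} - 0.18) \qquad (13)$$

and

$$c_\perp = r_{\rm mod} - i_{\rm mod} - (g_{\rm mod} - r_{\rm mod})/4.0 - 0.18. \qquad (14)$$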


Some objects satisfy both the LOWZ and CMASS selection crite- 
ria. We therefore apply a minimum (maximum) redshift of 0.43 to 
the CMASS (LOWZ) sample, after obtaining a redshift in order to 
have two mutually exclusive samples. 
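
As an illustration only, the colour-magnitude cuts of equations (1)-(6) and (9)-(14) and the redshift split at 0.43 can be sketched as below. The function names are hypothetical, the star-galaxy cuts of equations (7)-(8) are omitted, and two choices are assumptions rather than statements from the text: the thresholds of equations (2) and (12), which are hard to read in the source, and the assignment of a galaxy at exactly z = 0.43 to CMASS.

```python
def d_perp(g_mod, r_mod, i_mod):
    """Equation (6): d_perp = (r - i) - (g - r)/8, using model magnitudes."""
    return (r_mod - i_mod) - (g_mod - r_mod) / 8.0


def is_cmass_target(g_mod, r_mod, i_mod, i_cmod, i_fib2):
    """Equations (1)-(5). Magnitudes are assumed to be extinction corrected."""
    dp = d_perp(g_mod, r_mod, i_mod)
    return (
        17.5 < i_cmod < 19.9
        and r_mod - i_mod < 2.0            # assumed form of equation (2)
        and dp > 0.55
        and i_fib2 < 21.5
        and i_cmod < 19.86 + 1.6 * (dp - 0.8)
    )


def is_lowz_target(g_mod, r_mod, i_mod, r_cmod, r_psf):
    """Equations (9)-(14)."""
    c_par = 0.7 * (g_mod - r_mod) + 1.2 * (r_mod - i_mod - 0.18)
    c_perp = (r_mod - i_mod) - (g_mod - r_mod) / 4.0 - 0.18
    return (
        r_cmod < 13.5 + c_par / 0.3
        and abs(c_perp) < 0.2
        and 16.0 < r_cmod < 19.6
        and r_psf - r_mod > 0.3            # assumed threshold of equation (12)
    )


def assign_sample(passes_cmass, passes_lowz, z, z_split=0.43):
    """Make the two samples mutually exclusive once a redshift is known."""
    if passes_cmass and z >= z_split:      # boundary handling is an assumption
        return "CMASS"
    if passes_lowz and z < z_split:
        return "LOWZ"
    return None
```

An object passing both sets of cuts is therefore placed in exactly one sample, depending only on its measured redshift.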

The earliest set of spectra obtained for LOWZ data used an
overly restrictive algorithm designed to remove stellar contamination,
which unfortunately removed a significant number of galaxies
from the target sample. This algorithm was changed for later data
and, to maximize the size of the sample with an isotropic selection
algorithm, we reduce the area by excluding the regions observed
with the restrictive algorithm. Thus, the coverage (after accounting
for completeness, see Section 2.2) of the LOWZ sample we use
(2208 deg$^2$) is smaller than that of the CMASS sample (3275 deg$^2$).
Fig. 1 displays the angular footprint of the LOWZ sample in red
and the area that contains only CMASS data in blue. The footprint
contains 327,349 CMASS targets and 132,060 LOWZ targets. All
of the data in these catalogs will be publicly released in the SDSS
DR9.



2.2 Mask 

The BOSS DR9 geometry is constructed from a series of spectro- 
scopic observations, each of which is a 3° diameter circle on the 
sky, corresponding to one pointing of the telescope. Each of these 
circles^2 contains a unique set of targets and its area represents a
'tile' (see Blanton et al. 2003 and Dawson et al. in prep.). The total
area covered by these tiles forms the basis of our angular mask.
Some tiles are not fully covered by observation, usually due to a 
lack of imaging data in the targeted region, so these are additional 
boundaries that we include in the mask. We divide the total area 



^2 Each observation corresponds to a 'plate'. Each set of targets has a unique
tile, but multiple plates can observe the same tile (and thus repeat observa- 
tion of the exact same set of targets). 
into 'sectors', which are defined as the areas covered by a unique
set of tiles, i.e., the regions where the spectroscopic observing
conditions are the same. We use the software package Mangle^3
(Hamilton 1993; Hamilton & Tegmark 2004; Swanson et al. 2008),
together with the tile positions, to divide the area into the unique
sectors that we use to define the mask.

We also apply veto masks that exclude areas surrounding 
bright stars and imaging fields deemed not photometric; both of
these processes are described in Anderson et al. (2012). In these
areas, we do not expect the observing conditions to allow uniform 
detection of BOSS galaxies. Additionally, we mask the 92" diam- 
eter region at the center of every tile where fibers cannot be placed 
due to physical limitations. The veto mask is only applied to the 
data after targeting; that is, galaxies observed in non-photometric 
fields and near bright stars are excluded from our analysis. This 
provides a cleaner sample, albeit with a more complicated mask, 
than if we were to quantify the varying target selection due to these 
effects. These masks remove 4% of the observed footprint. 

The density of galaxy targets on the sky varies due to galaxy 
clustering. Therefore, given that there are a finite number of fibers 
for each tile, the percentage of targets receiving a fiber will vary. 
Additionally, fibers cannot be placed closer than 62" due to the size 
of the cladding around each fiber. We denote a 'close pair' as any 
object not assigned a fiber due to a collision with an object of the 
same target type (i.e., CMASS target with CMASS target), since 
collisions with objects of different types should show no spatial 
correlation. 
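
The 62" fiber-collision limit can be illustrated with a small sketch that lists pairs of same-type targets closer than that separation. This is not the survey tiling code: the function names are hypothetical, the search is a brute-force loop suitable only for small samples, and which member of a pair actually receives a fiber is decided by the tiling and is not modelled here.

```python
import math

FIBER_COLLISION_ARCSEC = 62.0  # minimum fiber separation quoted in the text


def ang_sep_arcsec(ra1, dec1, ra2, dec2):
    """Angular separation in arcseconds (haversine formula; inputs in degrees)."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((d2 - d1) / 2.0) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2.0) ** 2)
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(h)))) * 3600.0


def colliding_pairs(targets, limit=FIBER_COLLISION_ARCSEC):
    """Return index pairs (i, j), i < j, of targets closer than `limit`.

    `targets` is a list of (ra, dec) in degrees for a single target type.
    """
    pairs = []
    for i in range(len(targets)):
        for j in range(i + 1, len(targets)):
            if ang_sep_arcsec(*targets[i], *targets[j]) < limit:
                pairs.append((i, j))
    return pairs
```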

In each sector, we compile statistics using the same definitions
as Anderson et al. (2012). The angular completeness, $C_{\rm BOSS}$, and
the redshift completeness, $C_{\rm red}$, are determined by first counting the
number of objects in each sector that are:

(i) spectroscopically confirmed stars ($N_{\rm star}$),
(ii) galaxies with redshifts from good BOSS spectra ($N_{\rm gal}$),
(iii) galaxies with redshifts from SDSS-II spectra ($N_{\rm known}$),
(iv) objects with BOSS spectra from which stellar classification
or redshift determination failed ($N_{\rm fail}$),
(v) objects with no spectra, in a close-pair ($N_{\rm cp}$),
(vi) objects with no spectra, not in a close-pair ($N_{\rm missed}$).

These definitions represent a complete accounting for the possible
outcomes of BOSS targets. Objects contributing to $N_{\rm missed}$ will
either be observed in the future or are fiber collisions with objects
of a different target type. For each sector, Anderson et al. (2012)
then define the following:

$$N_{\rm targ} = N_{\rm star} + N_{\rm gal} + N_{\rm fail} + N_{\rm cp} + N_{\rm missed} + N_{\rm known}, \qquad (15)$$

where $N_{\rm targ}$ is the total number of target objects, and

$$N_{\rm obs} = N_{\rm star} + N_{\rm gal} + N_{\rm fail}, \qquad (16)$$

where $N_{\rm obs}$ is the total number of objects within the sector with a
BOSS spectrum. $C_{\rm BOSS}$ is then

$$C_{\rm BOSS} = \frac{N_{\rm obs} + N_{\rm cp}}{N_{\rm targ} - N_{\rm known}} \qquad (17)$$

and finally $C_{\rm red}$ is

$$C_{\rm red} = \frac{N_{\rm gal}}{N_{\rm obs} - N_{\rm star}}. \qquad (18)$$

The $C_{\rm BOSS}$ completeness varies from sector to sector due to fiber

^3 http://space.mit.edu/~molly/mangle/
© 2012 RAS, MNRAS 000,[7||28| 



collisions with objects with a different type and the fact that many 
objects will be observed in future observations. We subsample the 
known redshift sample (complete by definition) so that its completeness
matches $C_{\rm BOSS}$. We discard from our analysis any sectors
where $C_{\rm BOSS} < 0.7$ or $C_{\rm red} < 0.8$. These cuts remove 8% of
the total footprint covered by BOSS DR9, but only 3.5% of galaxy 
redshifts. Making the completeness cuts more restrictive does not 
significantly affect any clustering statistic we measure. 
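
The per-sector bookkeeping of equations (15)-(18) and the completeness cuts can be sketched as follows. The function names are hypothetical, and the numerator of the redshift completeness, which is not legible in the source, is assumed here to be the number of galaxies with good BOSS redshifts.

```python
def sector_completeness(n_star, n_gal, n_fail, n_cp, n_missed, n_known):
    """Return (C_BOSS, C_red) for one sector from the six object counts."""
    n_targ = n_star + n_gal + n_fail + n_cp + n_missed + n_known  # eq. (15)
    n_obs = n_star + n_gal + n_fail                               # eq. (16)
    if n_targ == n_known or n_obs == n_star:
        raise ValueError("completeness is undefined for this sector")
    c_boss = (n_obs + n_cp) / (n_targ - n_known)                  # eq. (17)
    c_red = n_gal / (n_obs - n_star)                              # eq. (18), assumed numerator
    return c_boss, c_red


def keep_sector(c_boss, c_red, min_boss=0.7, min_red=0.8):
    """Sectors with C_BOSS < 0.7 or C_red < 0.8 are discarded."""
    return c_boss >= min_boss and c_red >= min_red
```

For example, a sector with 3 stars, 90 good galaxy redshifts, 2 failures, 4 close pairs, 1 missed target and 10 known redshifts has $C_{\rm BOSS} = 99/100$ and $C_{\rm red} = 90/92$, and is kept.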

Fig. 2 displays CMASS targets for a selected observed area.
Areas covered by more than one tile are shaded such that the outline
of each tile is clearly visible. These overlapping regions cover 41%
of the total DR9 footprint. Targets with good redshifts are plotted
as small black points. Targets not allocated a fiber are green. Within
a given sector, these should be random with respect to the position
of other CMASS targets, and these are therefore accounted for by
$C_{\rm BOSS}$. (Although they are more likely to occur in sectors where
future observations are planned, they are still random within these
sectors.) Targets not allocated a fiber due to close pair collisions
are blue. This happens most frequently (but not exclusively) in
regions covered by only one tile. Targets that were allocated a fiber
but whose observation did not result in a good redshift measurement
are red. In general, these occur more frequently near to the
tile boundaries. We discuss these cases further in Section 2.3.

We create random (unclustered) catalogs by isotropically
populating the sky, then selecting only those positions lying inside
sectors with $C_{\rm BOSS} > 0.7$ and $C_{\rm red} > 0.8$ and outside of the veto
mask. We then cull the random positions in every sector based on
their $C_{\rm BOSS}$ (i.e., if $C_{\rm BOSS} = 0.9$, we randomly remove 10% of
the random points). This process yields a random catalog that
mimics the angular distribution of our galaxy catalogs, save for fiber
collisions with galaxies of the same type, redshift completeness, and
systematic effects in the imaging. We correct for these remaining
effects using a series of weights, as described in Section 3.2. Our
default approach is to assign to each random position the redshift
of a randomly selected galaxy, and we test this approach in Section
6.3.
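
A minimal sketch of the two random-catalog steps described above, culling by sector completeness and drawing redshifts from the galaxy sample, is given below. The data layout (a sector identifier attached to each random point and a dictionary of sector completeness values) and the function names are assumptions for illustration. The sector cuts and the veto mask are assumed to have been applied beforehand.

```python
import random


def cull_randoms(points, completeness, rng=None):
    """Keep each random point with probability equal to its sector's C_BOSS.

    `points` is a list of (ra, dec, sector_id) and `completeness` maps
    sector_id to C_BOSS; with C_BOSS = 0.9 about 10% of points are removed.
    """
    rng = rng or random.Random()
    return [p for p in points if rng.random() < completeness[p[2]]]


def assign_redshifts(n_randoms, galaxy_redshifts, rng=None):
    """Give each random point the redshift of a randomly selected galaxy."""
    rng = rng or random.Random()
    return [rng.choice(galaxy_redshifts) for _ in range(n_randoms)]
```

Drawing redshifts with replacement from the observed galaxies reproduces the observed radial distribution by construction, which is why the choice needs to be checked against mocks.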



2.3 Redshift Failures 

We define a 'good' redshift as one for which the spectroscopic
pipeline sets no 'zWARNING' flags (as defined in Adelman-McCarthy
et al. 2008). This flag indicates that the redshift is unreliable,
typically because there are multiple acceptable redshift solutions
(usually due to low signal to noise) or because the spectrum is
defective. Analysis of repeat observations of BOSS targets and visual
inspection reveal that redshifts of galaxies with
zWarning = 0 are reliable (accurate to < 0.001 in $\Delta z/(1 + z)$)
at the 99.7% level, whereas those with zWarning > 0 are reliable
just 67% of the time. For CMASS targets, good redshifts are
obtained for 98.2% of targets; for LOWZ, it is 99.6%. Although this
completeness is quite high, one may worry that the failures may
relate to observational systematics or have a preferred location on
an observing tile.

BOSS fibers are numbered such that a given fiber corresponds
to a particular position on the CCD. The spectrograph optics point
spread function degrades near the edges of the CCDs, thus lowering
the quality of the extracted spectra and reducing the likelihood of
obtaining good quality redshifts from spectra near the CCD edges.
(See Gunn et al. in prep. for more details on the performance of the
BOSS spectrograph.) This correlation between redshift quality and
fiber number translates into a spatial dependence on the sky, given
that fibers are not assigned randomly. In order to test this effect, we
translate all of the fiber positions of galaxies targeted by BOSS to
positions relative to the center of the tile. This allows us to
determine the redshift failure rate as a function of position on the tile
(and thus whether redshift failures may impart angular fluctuations
in the density of observed galaxies). The result of this test is
displayed in Fig. 3. The failed redshifts are not only more likely to be
on the edge of a tile, but appear concentrated near the minimum
and maximum right ascension of each tile. We apply weights (see
Section 3.2) to correct for this spatial dependence, but find there to
be a negligible effect on the measured clustering (see Fig. 5).

Figure 3. The percentage of failed CMASS redshifts as a function of the
position on the tile, averaged over 817 DR9 tiles. The lightest regions are
0% and the darkest regions are 12%. $\Delta\alpha$ is the distance along the right
ascension direction and $\Delta\delta$ is the distance along the declination direction
(both transformed so that the true angular separations are represented).

Figure 4. Galaxy spatial co-moving number density as a function of redshift,
assuming a flat $\Lambda$CDM cosmology with $\Omega_m = 0.274$, for CMASS galaxies.
The solid line is calculated for all galaxies, while the dashed line only
includes those galaxies nearest to a redshift failure, renormalised to match
the total density of the full sample. The error-bars assume Poissonian
distribution for the number counts in each bin.

Fig. 4 shows the galaxy spatial number density for the
CMASS sample, as a function of redshift. We also plot the
normalised (so that it has the same integral) number density against
redshift for the galaxies nearest to a redshift failure, $n_{nzf}(z)$. Were
there a strong trend with redshift, for example that we were only
missing redshifts for high-redshift galaxies, we should expect that
the nearest neighbours to the redshift failures (which should be
selected with similar properties, such as fiber ID, seeing and
extinction), should predominantly be at low redshift. In fact we see no
such trend; if anything we find evidence to the contrary. We
estimate the uncertainty on $n_{nzf}(z)$ in each $z$ bin by assuming a
Poissonian distribution for the number counts in each bin and we
determine the $\chi^2$ when the (normalized) $n_{nzf}(z)$ is compared to that of
the full $n(z)$. We find $\chi^2 = 34.4$ for $0.43 < z < 0.7$ (27 bins; 15%
of consistent samples would have a higher $\chi^2$) and $\chi^2 = 23.7$ for
$0.5 < z < 0.7$ (20 bins; 26% of consistent samples would have a
higher $\chi^2$). We therefore find no evidence that the spatially
dependent component (which is the component that we are interested in,
as it may create a spurious clustering signal) of the redshift-failure
probability is dependent on redshift.
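
The comparison of the two redshift distributions can be sketched as a simple $\chi^2$ over redshift bins. This is an illustration under stated assumptions, not the code used for the numbers quoted above: the full-sample counts are rescaled to the sub-sample total, the variance in each bin is taken to be the observed sub-sample count (a Poisson assumption), and empty bins are skipped.

```python
def chi2_shape(n_all, n_sub):
    """Chi-square of binned counts n_sub against the rescaled shape of n_all.

    n_all: counts per redshift bin for the full sample.
    n_sub: counts per bin for the galaxies nearest to a redshift failure.
    """
    scale = sum(n_sub) / sum(n_all)   # normalise to the same total
    chi2 = 0.0
    for full, sub in zip(n_all, n_sub):
        if sub > 0:                   # Poisson variance = observed count
            chi2 += (sub - full * scale) ** 2 / sub
    return chi2
```

Identical shapes give zero; the value is then compared with the number of bins to judge consistency.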



3 ANALYSIS TECHNIQUES 



3.1 Clustering Estimators 



We use the Landy & Szalay (1993) estimator to calculate the
anisotropic redshift space correlation function, $\xi(s, \mu)$, where $s$ is
the redshift-space separation in $h^{-1}$ Mpc and $\mu$ is the cosine of the
angle to the line-of-sight:

$$\xi(s,\mu) = \frac{DD(s,\mu) - 2DR(s,\mu) + RR(s,\mu)}{RR(s,\mu)} \qquad (19)$$

where D represents the data sample (i.e., BOSS galaxies) and R
represents the random sample (occupying the angular footprint and
with the same redshift distribution as the data sample) and the
pair-counts are normalised to the total number.

In linear theory, the first three even moments of $\xi(s, \mu)$, expanded
in Legendre polynomials, contain all of the information:

$$\xi_\ell(s) = \frac{2\ell + 1}{2} \int_{-1}^{1} d\mu\, P_\ell(\mu)\, \xi(s,\mu). \qquad (20)$$

We therefore weight pairs by $P_\ell$ (yielding separate pair-counts for
$\ell = 0, 2, 4$ for each of DD, DR, and RR). Labelling the
$P_\ell$-weighted pair-counts with subscript $\ell$, we determine $\xi_\ell(s)$ via

$$\frac{2\xi_\ell(s)}{2\ell + 1} = \frac{DD_\ell(s) - 2DR_\ell(s) + RR_\ell(s)}{RR_0(s)}. \qquad (21)$$

We count pairs in bins 1 $h^{-1}$ Mpc wide in $s$ and focus our efforts
on understanding $\xi_0$ and $\xi_2$ (rather than the full $\xi(s, \mu)$), as these
two measurements are expected to contain almost all of the
information. In general, one must be careful, as our procedure implicitly
assumes that pairs are isotropically selected in $\mu$, and this is not
true for a general survey geometry. In this study, we do not need to
be concerned, as we will always be comparing our results to those
obtained via mock catalogs (which accurately match the survey
geometry; see Section 3.3 and Manera et al. 2012), but it must be
accounted for in studies that compare more general models to the
data. For a discussion of the procedures one may use, in general,
when a survey does not provide an isotropically distributed sample
of pairs, see Samushia et al. (2011).

Figure 5. The effect of including weights for redshift failures ($w_{zf}$, red) and un-observed close-pairs due to fiber-collisions ($w_{fc}$, blue) and their combination
(black) on $\xi_0$ and $\xi_2$. The dashed lines display the $1\sigma$ statistical uncertainty expected from mock catalogs.
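
As a rough illustration of the estimator and of the Legendre projection of equation (20), the sketch below evaluates the Landy & Szalay combination for one bin and integrates a tabulated $\xi(s,\mu)$ at fixed $s$ with a midpoint rule. It implements the integral form directly rather than the $P_\ell$-weighted pair counts used in the analysis, and the function names and the equal-width $\mu$ binning over $[-1, 1]$ are assumptions.

```python
def legendre(ell, mu):
    """Legendre polynomials P_0, P_2 and P_4."""
    if ell == 0:
        return 1.0
    if ell == 2:
        return 0.5 * (3.0 * mu ** 2 - 1.0)
    if ell == 4:
        return (35.0 * mu ** 4 - 30.0 * mu ** 2 + 3.0) / 8.0
    raise ValueError("only ell = 0, 2, 4 are supported")


def landy_szalay(dd, dr, rr):
    """Landy & Szalay estimate from normalised pair counts in one bin."""
    return (dd - 2.0 * dr + rr) / rr


def multipole(xi_mu, ell):
    """(2 ell + 1)/2 times the integral of P_ell(mu) xi(mu) over [-1, 1].

    xi_mu holds xi(s, mu) at the midpoints of equal-width mu bins.
    """
    n = len(xi_mu)
    dmu = 2.0 / n
    total = 0.0
    for k, xi in enumerate(xi_mu):
        mu = -1.0 + (k + 0.5) * dmu   # bin midpoint
        total += legendre(ell, mu) * xi * dmu
    return 0.5 * (2 * ell + 1) * total
```

Feeding in $\xi(\mu) = 1 + 0.5\,P_2(\mu)$ should return a monopole close to 1, a quadrupole close to 0.5, and a hexadecapole close to 0, up to the integration error.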

We also use HEALPix (Górski et al. 2005) maps, using
Nside=256, to calculate projected auto/cross-correlation functions,
$\xi_p(r_{\rm eff})$, of galaxies and potential systematics (such as Galactic
extinction). We split the sample into redshift shells of width
$\Delta z = 0.01$ and define the overdensity, $\delta$, in redshift shell $z$ and
pixel $i$ as

$$\delta_{i,z} = x_{i,z}/x_{\rm ave} - 1 \qquad (22)$$

where $x_{i,z}$ is the value of the quantity in question in pixel $i$ and $x_{\rm ave}$
is the average of the quantity over all pixels. We can thus calculate
$\xi_p$, as a function of the effective scale, $r_{\rm eff}$, using pixelized maps
via

$$\xi_p(r_{\rm eff}) = \frac{\sum_{i,j,z_1,z_2} \delta_{i,z_1}\,\delta_{j,z_2}\,\Theta_{i,j,z_1,z_2}(r)\,N(z_1)\,N(z_2)\,w_i w_j}{\sum_{i,j,z_1,z_2} \Theta_{i,j,z_1,z_2}(r)\,N(z_1)\,N(z_2)\,w_i w_j} \qquad (23)$$

where the indices $i, j$ represent the angular positions of pixels $i$
and $j$ and $z_1$, $z_2$ represent redshift slices, $\Theta_{i,j,z_1,z_2}(r)$ is 1 if the
distance between the pixels (as determined by $i, j, z_1, z_2$) is within
the bin defined by $r_{\rm eff} \pm \delta r_{\rm eff}$ and 0 otherwise, $N(z_1)$ is the
number of galaxies in shell $z_1$, and $w_i$ is the weight of the pixel
(see the following section), which we determine using the random
catalog. It is straightforward to insert a purely angular map (e.g., the
Galactic extinction) to determine how its angular cross-correlation
with the galaxy field is translated to the physical scale $r_{\rm eff}$. One
simply holds its overdensity field constant with redshift and assigns
it a flat $n(z)$ and proceeds through the sum defined above.

We also calculate power spectra, $P(k)$, using the standard
Fourier technique of Feldman et al. (1994), as described by Reid et
al. (2010). Here the window function is accounted for as a
convolution of the model. This implies power spectra can only be easily
compared, without correcting for varying window functions, when
they have the same selection and weighting. We therefore
predominantly use the correlation function to present results, but we will
show $P(k)$ measurements in Section 7, as these measurements
isolate the largest wavelength density perturbations (separate bins in $s$
are highly covariant).



3.2 Weights

We use weights to account for spatial variations in redshift
failures, fiber collisions, and imaging systematics, i.e., those effects
that are not quantified via the $C_{\rm BOSS}$ completeness as described
above. Given the total weight, $w_{\rm tot}$, for each galaxy,

$$DD(s,\mu) = \sum_i \sum_j w_{{\rm tot},i}\, w_{{\rm tot},j}\, \Theta_{ij}(s,\mu), \qquad (24)$$

where $\Theta_{ij}(s,\mu)$ is 1 if the separation between the two galaxies
and the angle they make to the line of sight is within the particular
bin, and 0 otherwise. These weights correct the galaxy densities to
provide a more isotropic selection. They should therefore not be
applied to a random catalog, if it is based on an isotropic selection.
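
Equation (24) can be illustrated with a brute-force weighted pair count. The sketch below is only a toy: it assumes Cartesian comoving positions with the observer at the origin, defines the line of sight as the direction to the pair midpoint, bins in $|\mu|$, and counts each unordered pair once. None of these conventions is specified in the text, and a production code would use a tree or grid rather than a double loop.

```python
import math
from bisect import bisect_right


def pair_s_mu(p1, p2):
    """Separation s and |mu| of a pair, with the observer at the origin."""
    sx, sy, sz = (p1[k] - p2[k] for k in range(3))
    lx, ly, lz = ((p1[k] + p2[k]) / 2.0 for k in range(3))
    s = math.sqrt(sx * sx + sy * sy + sz * sz)
    los = math.sqrt(lx * lx + ly * ly + lz * lz)
    if s == 0.0 or los == 0.0:
        return s, 0.0
    mu = abs(sx * lx + sy * ly + sz * lz) / (s * los)
    return s, min(mu, 1.0)


def weighted_dd(positions, weights, s_edges, mu_edges):
    """Sum w_tot,i * w_tot,j over pairs falling in each (s, mu) bin."""
    ns, nmu = len(s_edges) - 1, len(mu_edges) - 1
    counts = [[0.0] * nmu for _ in range(ns)]
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            s, mu = pair_s_mu(positions[i], positions[j])
            a = bisect_right(s_edges, s) - 1
            b = bisect_right(mu_edges, mu) - 1
            if mu == mu_edges[-1]:
                b = nmu - 1               # include mu = 1 in the last bin
            if 0 <= a < ns and 0 <= b < nmu:
                counts[a][b] += weights[i] * weights[j]
    return counts
```

The same routine with unit weights would be applied to the random catalog, consistent with the statement that these weights are not applied to randoms drawn from an isotropic selection.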
We will find that there is a systematic relationship between the density of targets (see Section 5) and the density of stars, and we therefore require a weight, $w_{\rm star}$. We describe how we determine $w_{\rm star}$ in Section 5. We account for systematics that affect the process of going from the target catalog to a redshift catalog with weights for both redshift failures, $w_{\rm zf}$, and fiber collisions with targets of the same type ('close pairs'), $w_{\rm fc}$. We start by assigning each $w_{\rm zf}$ and $w_{\rm fc}$ unit weight. We have found that the probability of a redshift failure depends on the position of the fiber relative to the tile center (see Fig. pi). To account for this, we find the nearest neighbor on the sky to the object with the failed redshift, and increase its $w_{\rm zf}$ by one. Such a weighting preserves the large-scale angular auto-correlation function. Further, this action should not bias the redshift-space correlation function, as we find no evidence that the probability of a redshift failure depends on redshift (see Fig. pi). Similarly, we determine $w_{\rm fc}$ by up-weighting the colliding neighbor by one. Guo et al. (2011) have developed an optimized method to account for fiber collisions at all scales, but at large scales ($s > 10\,h^{-1}\,$Mpc), our method should produce identical results. We combine the redshift failure and fiber-collision weights into a single weight equal to $w_{\rm fc} + w_{\rm zf} - 1$. Thus, the total weight we apply to each galaxy is



$$w_{\rm tot} = w_{\rm star}\,(w_{\rm fc} + w_{\rm zf} - 1). \qquad (25)$$



The weight for any galaxy can be arbitrarily large, e.g., a galaxy 
will be given a weight of three if it is nearest to a target with a 
redshift failure and also causes a fiber collision with another galaxy 
(and has Wstar = 1). 
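
The bookkeeping described above can be sketched in a few lines of Python. This is a minimal illustration, not the survey pipeline: the `Galaxy` class, the `upweight_nearest` helper, and the coordinates are hypothetical, and the combination $w_{\rm tot} = w_{\rm star}(w_{\rm fc} + w_{\rm zf} - 1)$ is assumed from the description of Eq. 25. For simplicity the fiber-collision case also uses a nearest-neighbor search, which only stands in for "the colliding neighbor".

```python
import math
from dataclasses import dataclass


@dataclass
class Galaxy:
    ra: float            # degrees
    dec: float           # degrees
    w_star: float = 1.0  # stellar-density weight
    w_zf: float = 1.0    # redshift-failure weight, starts at unity
    w_fc: float = 1.0    # fiber-collision weight, starts at unity

    def w_tot(self) -> float:
        # Assumed form of Eq. 25.
        return self.w_star * (self.w_fc + self.w_zf - 1.0)


def ang_sep(ra1, dec1, ra2, dec2):
    """Great-circle separation in radians (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return 2 * math.asin(min(1.0, math.sqrt(h)))


def upweight_nearest(galaxies, ra, dec, kind):
    """Add one to the `kind` weight of the galaxy nearest to (ra, dec)."""
    nearest = min(galaxies, key=lambda g: ang_sep(g.ra, g.dec, ra, dec))
    setattr(nearest, kind, getattr(nearest, kind) + 1.0)
    return nearest


# Hypothetical toy catalog: three galaxies with good redshifts.
galaxies = [Galaxy(150.00, 2.00), Galaxy(150.50, 2.10), Galaxy(151.00, 2.50)]
upweight_nearest(galaxies, 150.01, 2.01, "w_zf")   # a failed redshift nearby
upweight_nearest(galaxies, 150.00, 2.015, "w_fc")  # a collided target nearby
weights = [g.w_tot() for g in galaxies]
```

In this toy case the first galaxy absorbs both a redshift failure and a fiber collision and so carries a total weight of three, matching the example given in the text, while the other two keep unit weight.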

Fig. 5 displays the difference we find in the measured $\xi_0$ and $\xi_2$ of CMASS galaxies when applying the redshift failure (red) and fiber-collision (blue) weights and their combination (black). At scales greater than $50\,h^{-1}\,$Mpc, the effect is negligible compared to the expected statistical uncertainty for the sample (displayed in dashed black lines). Redshift failures have no significant effect on the measured clustering.

The application of the fiber-collision weights increases the clustering amplitudes at scales less than $80\,h^{-1}\,$Mpc. This effect is expected, as fiber collisions will be more likely to occur in highly clustered samples. At $20\,h^{-1}\,$Mpc, the difference is nearly $1\sigma$. The weights for redshift failures impart a slight decrease in the measured amplitudes, at a level consistently less than 20% of the statistical uncertainty determined using the mock galaxy catalogs (see Section 3.3).

We also want to optimally weight galaxies based on their number density, as suggested by Feldman et al. (1994). We refer to these weights as 'FKP' weights. To do so, we use a constant $P_{\rm FKP} = 20000\,h^{-3}\,$Mpc$^3$ (as do Anderson et al. 2012; this is roughly the amplitude of the CMASS $P(k)$ at $k = 0.1\,h\,$Mpc$^{-1}$) and weight by

$$w_P = 1/(1 + n(z)P_{\rm FKP}), \qquad (26)$$



where $n(z)$ is the number density of galaxies at redshift $z$, determined using our assumed cosmology. The purpose of these weights is to optimally weight areas with different number densities, not to correct observed number densities for a systematic effect. Thus, they are applied to both the random objects and the real galaxies; the final pair counts can be expressed as:

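$$DD_\ell(s) = \sum_{D_i}\sum_{D_j} w_{P,D_i}\,w_{{\rm tot},D_i}\,w_{P,D_j}\,w_{{\rm tot},D_j}\,\Theta_{D_iD_j}(s)\,P_\ell(\mu), \qquad (27)$$

$$DR_\ell(s) = \sum_{D_i}\sum_{R_j} w_{P,D_i}\,w_{{\rm tot},D_i}\,w_{P,R_j}\,\Theta_{D_iR_j}(s)\,P_\ell(\mu), \qquad (28)$$

$$RR_\ell(s) = \sum_{R_i}\sum_{R_j} w_{P,R_i}\,w_{P,R_j}\,\Theta_{R_iR_j}(s)\,P_\ell(\mu). \qquad (29)$$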
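
The FKP weight of Eq. 26 and the Legendre-weighted pair sums of Eqs. 27-29 can be illustrated with a brute-force sketch. Everything below is an assumption made for illustration: the function names, the toy positions, the number density of $3\times10^{-4}\,h^3\,$Mpc$^{-3}$, and in particular the choice of the pair midpoint as the line-of-sight direction, which the text does not specify. Production codes use tree or grid pair counters rather than this $O(N^2)$ double loop.

```python
import math

P_FKP = 20000.0  # h^-3 Mpc^3, the constant quoted in the text


def w_fkp(nbar, p_fkp=P_FKP):
    """FKP weight of Eq. 26 for a number density nbar in h^3 Mpc^-3."""
    return 1.0 / (1.0 + nbar * p_fkp)


def legendre(ell, mu):
    """Legendre polynomial P_ell(mu) for ell = 0, 2, 4."""
    if ell == 0:
        return 1.0
    if ell == 2:
        return 0.5 * (3.0 * mu ** 2 - 1.0)
    if ell == 4:
        return (35.0 * mu ** 4 - 30.0 * mu ** 2 + 3.0) / 8.0
    raise ValueError("only ell = 0, 2, 4 are supported")


def pair_count(cat_a, cat_b, s_lo, s_hi, ell):
    """Weighted double sum over pairs with s_lo <= s < s_hi.

    Each catalog is a list of (position_xyz, weight). For data the weight
    is w_P * w_tot; for randoms it is w_P alone.
    """
    total = 0.0
    for pos_a, w_a in cat_a:
        for pos_b, w_b in cat_b:
            sep = [b - a for a, b in zip(pos_a, pos_b)]
            s = math.sqrt(sum(c * c for c in sep))
            if s == 0.0 or not (s_lo <= s < s_hi):
                continue  # Theta = 0 for this pair
            # Assumed line of sight: direction to the pair midpoint.
            los = [0.5 * (a + b) for a, b in zip(pos_a, pos_b)]
            los_norm = math.sqrt(sum(c * c for c in los))
            if los_norm == 0.0:
                continue
            mu = abs(sum(c * d for c, d in zip(sep, los))) / (s * los_norm)
            total += w_a * w_b * legendre(ell, mu)
    return total


# Toy example: two galaxies separated by 10 Mpc/h along the line of sight,
# with total weights w_tot = 1 and w_tot = 2.
nbar = 3.0e-4          # hypothetical number density
w_p = w_fkp(nbar)      # 1 / (1 + 6) = 1/7
data = [((0.0, 0.0, 1000.0), w_p * 1.0), ((0.0, 0.0, 1010.0), w_p * 2.0)]
dd0 = pair_count(data, data, 5.0, 15.0, 0)
dd2 = pair_count(data, data, 5.0, 15.0, 2)
```

Because the toy pair lies exactly along the line of sight ($\mu = 1$), the quadrupole and monopole sums coincide here; each equals $4w_P^2$, since the double sum visits the pair twice.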



3.3 Covariance Matrices 

We use mock galaxy catalogs with realistic clustering to determine covariance matrices for the distributions of galaxies and their clustering measurements. The mock catalogs have been produced with the method explained in Manera et al. (2012). This method is inspired by the Perturbation Theory Halos (PTHalos) paper of Scoccimarro & Sheth (2002), and has been calibrated using 40 N-body galaxy and halo mock catalogs generated using the LasDamas simulations* (McBride et al., in prep.).

Figure 6. Top panels: The average of $\xi_\ell(s)$ we determine using 600 mocks, for the NGC (black) and SGC (red) footprints and with (solid lines) and without (dashed lines) FKP weights. The baryon acoustic oscillation peak can clearly be seen at $s \sim 100\,h^{-1}\,$Mpc in $\xi_0$. Bottom panels: the standard deviation of the 600 mocks, using the same scheme as the top panels. [Horizontal axis: $s$ ($h^{-1}\,$Mpc).]

Using second-order Lagrangian perturbation theory (2LPT), Manera et al. (2012) generate 600 matter fields at redshift 0.55 drawn using periodic boxes of size $L = 2400\,h^{-1}\,$Mpc (one matter field from each box). Each uses a flat cosmology defined by $\Omega_m = 0.274$, $\Omega_b h^2 = 0.0224$, $h = 0.70$, $n_s = 0.95$, and $\sigma_8 = 0.8$. This is the same cosmology as White et al. (2011) and is close to WMAP7 parameters (Larson et al. 2011). Halos from 2LPT runs are identified using a friends-of-friends algorithm and are then mass-calibrated using the Tinker et al. (2010) mass function. These halos are then populated with galaxies using the halo-occupation-distribution (HOD) parameterization defined by Zheng et al. (2007). We determine the HOD parameters by fitting the CMASS $\xi_0(s)$ measurement using data in the range $30 < s < 80\,h^{-1}\,$Mpc.

For each mock realization, the periodic boxes are reshaped to match the final BOSS geometries for the NGC or the SGC footprints (see Fig. 1; the box sizes are not large enough to accommodate both NGC and SGC simultaneously). Redshift distortions are then applied based on the 2LPT velocity field, combined with a model for the intra-halo velocity dispersion. The final DR9 angular masks are then applied, and galaxies are sampled from the full simulations to match the CMASS radial selection function (displayed in Fig. 4), thus yielding 600 mock galaxy catalogs mimicking the clustering of BOSS CMASS galaxies. For further details of the methods, see Manera et al. (2012).

For each of the 600 mock galaxy catalogs, in both the NGC and SGC, we calculate $\xi_\ell$ with and without FKP weights (see Section 3.2). The mean results of these measurements (top panels) and their standard deviations (bottom panels) are displayed in Fig. 6. The FKP weights decrease the standard deviation, typically by 10%. We note that from here on we do not test $\xi_4(s)$ measurements, as, in Reid et al. (2012), the information added by incorporating $\xi_4(s)$ affords only marginal improvements.

The covariance of $\xi_\ell(s)$ at separations $s_1$, $s_2$ and moments $\ell$, $\ell'$ is determined using the standard mathematical definition, i.e.,

$$C_{\ell,\ell'}(s_1,s_2) = \frac{1}{599}\sum_{i=1}^{600}\left(\xi^i_\ell(s_1) - \bar{\xi}_\ell(s_1)\right)\left(\xi^i_{\ell'}(s_2) - \bar{\xi}_{\ell'}(s_2)\right). \qquad (30)$$

To calculate the covariance matrix over the total footprint, we assume the two regions are independent and thus $C^{-1}_{\rm tot} = C^{-1}_{\rm North} + C^{-1}_{\rm South}$. We use the covariance matrix to calculate $\chi^2$ in the usual manner and apply tests on how our treatment of the data affects derived parameters. Any time we quote a $\chi^2$ value, it is calculated using the covariance matrix determined from the mocks.

Figure 7. The normalized covariance matrix of and between $\xi_\ell$ determined using 600 mock galaxy catalogs simulating the BOSS DR9 selection function. [Axes: $s_1$ ($h^{-1}\,$Mpc).]

* http://lss.phy.vanderbilt.edu/lasdamas/



3.4 Modelling

We use the mean of clustering measurements determined using the mock catalogs to define the fiducial models we test. We derive parameters relating to the amplitude of the real-space galaxy density field, $b$, the amplitude of the galaxy velocity field, $f$, and the factor by which the distances assumed by our cosmological model are incorrect, $\alpha$ (as this relates to the position of the BAO feature, see Anderson et al. 2012). We expect that if we obtain robust results on these parameters, one should be able to derive robust constraints on any derived parameter, over the same range in $s$.

We assume a linear model for redshift-space clustering (Kaiser 1987) and assume linear biasing. In this model, given the real-space correlation function of matter, $\xi_M$,

$$\xi_0(b,f) = \xi_M(r)\,(b^2 + 2/3\,bf + 0.2f^2), \qquad (31)$$

$$\xi_2(b,f) = (4/3\,bf + 4/7\,f^2)\,[\xi_M(r) - \bar{\xi}_M(r)], \qquad (32)$$

$$\xi_4(f) = \frac{8}{35}f^2\left[\xi_M(r) + \frac{5}{2}\bar{\xi}_M(r) - \frac{7}{2}\bar{\bar{\xi}}_M(r)\right], \qquad (33)$$

and

$$\bar{\xi}(s) \equiv 3s^{-3}\int_0^s \xi(r')\,(r')^2\,dr', \qquad (34)$$

$$\bar{\bar{\xi}}(s) \equiv 5s^{-5}\int_0^s \xi(r')\,(r')^4\,dr' \qquad (35)$$

(Hamilton 1992), where $b$ is the real-space linear bias of the galaxy sample and $f$ is the logarithmic rate of change of the linear growth factor: $f \equiv d\log D/d\log a$, where $D$ is the linear growth factor and $a$ is the scale factor of the Universe. We therefore determine our model $\xi^{\rm mod}_\ell$ by re-scaling the average mock $\xi^{\rm mock}_\ell$ according to:

$$\xi^{\rm mod}_0(b,f) = \xi^{\rm mock}_0\,(b^2 + 2/3\,bf + 0.2f^2)/(4.657), \qquad (36)$$

$$\xi^{\rm mod}_2(b,f) = \xi^{\rm mock}_2\,(4/3\,bf + 4/7\,f^2)/(2.188), \qquad (37)$$

$$\xi^{\rm mod}_4(f) = \xi^{\rm mock}_4\,f^2/0.548. \qquad (38)$$



The values in the denominators account for a bias of the mocks of $b = 1.9$ (fits to the model of Reid & White 2011 suggest this is accurate to within 2% and we fix it at 1.9 for simplicity) and the fact that $f(z = 0.55) = 0.74$ for the cosmology used by the mocks (flat $\Omega_m = 0.274$; e.g., the denominator for Eq. 36 is given by $1.9^2 + 2/3 \times 1.9 \times 0.74 + 0.2 \times 0.74^2 = 4.657$). This model thus assumes the same scalings in amplitude as linear RSD theory, but uses the shape of the mean mock galaxy $\xi_\ell(s)$, which includes non-linear RSD features. We note that linear theory is not appropriate to obtain
accurate estimates of 6 and / (see |Reid & White|201 Ij l; it is for this 
reason that we quote our measurements as, e.g., 6 rather than 6. 
However, we expect that if our treatment of the data yields robust estimates of $b$ and $f$, the measurements will also be robust when more accurate models are tested in Samushia et al. (2012) and Reid et al. (2012).
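
The three denominators quoted for Eqs. 36-38 follow directly from the linear-theory amplitude factors evaluated at the mock values $b = 1.9$ and $f = 0.74$. The short sketch below checks that arithmetic and shows how a mock multipole would be rescaled; the function names and the sample multipole values are hypothetical and are not taken from the paper.

```python
B_MOCK = 1.9   # bias assumed for the mocks
F_MOCK = 0.74  # f(z = 0.55) for the mock cosmology


def kaiser_factors(b, f):
    """Amplitude factors multiplying the mock multipoles in Eqs. 36-38."""
    mono = b * b + (2.0 / 3.0) * b * f + 0.2 * f * f
    quad = (4.0 / 3.0) * b * f + (4.0 / 7.0) * f * f
    hexa = f * f
    return mono, quad, hexa


# Denominators of Eqs. 36-38: the same factors at the mock values.
DENOMINATORS = kaiser_factors(B_MOCK, F_MOCK)


def scale_mock(xi_mock, b, f):
    """Rescale mean mock multipoles (xi_0, xi_2, xi_4) to a trial (b, f)."""
    factors = kaiser_factors(b, f)
    return tuple(x * a / d for x, a, d in zip(xi_mock, factors, DENOMINATORS))


rounded = tuple(round(d, 3) for d in DENOMINATORS)

# Hypothetical multipole values at one separation, for illustration only.
xi_mock = (0.010, -0.006, 0.001)
unchanged = scale_mock(xi_mock, B_MOCK, F_MOCK)
```

Evaluating the model at the mock values of $b$ and $f$ returns the mock multipoles unchanged, which is the normalization the denominators are meant to enforce.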

We also test the robustness of the data to a simple dilation in scale, using models where we vary a 'stretch parameter', $\alpha$. In this case

$$\xi^{\rm mod}_0(b,\alpha,s) = \xi^{\rm mock}_0(\alpha s)\,(b^2 + 0.48b + 0.1036)/(4.657), \qquad (39)$$

(where we are now fixing $f$). We determine $\xi^{\rm mock}_0(\alpha s)$ via power-law interpolation of the fiducial mean mock result at scales $s < 80\,h^{-1}\,$Mpc and linear interpolation at larger scales. We note that this stretch parameter will contain information on, at least, $\Omega_m h^2$ (from the overall peak of the power spectrum) and the position of the BAO peak. We therefore believe that if our treatment of the data yields robust $\alpha$ values, robust measurements of both the BAO position and $\Omega_m h^2$ will be obtained when more sophisticated models are applied, as in Anderson et al. (2012) and Sanchez et al. (2012). We study the recovered values of $b$, $f$, and $\alpha$ according to Galactic hemisphere (Section 4.2), angular weights (Section 5.3), and redshift (Section 6.2), and these results are summarized in Table 1, found in Section 8.



3.5 Default Survey Window 

The following three sections define and justify our recommendations for how to treat the survey window in regard to the NGC and SGC footprints, photometric systematics, and the radial selection function. In each section, these recommendations are used, unless otherwise noted. The default is to: treat the NGC and SGC as having separate selection functions and optimally combine their individual pair-counts; apply weights to each galaxy based on linear relationships between $i_{\rm fib2}$ magnitude and stellar density; and apply redshifts to the random sample by randomly selecting redshifts from the galaxy sample.



4 DEPENDENCE ON GALACTIC HEMISPHERE 

The SDSS imaging was carried out in two large contiguous areas in the NGC and SGC (see Fig. 1). The mean sky background and airmass are both higher for the SGC imaging, which results in larger uncertainties on the measured magnitudes of these data: the mean uncertainty for $i$-band CMASS targets is 0.076 in the NGC and 0.101 in the SGC. However, we show in Sections 5 and 6 that the projected number density and redshift distributions of BOSS targets do not depend on either sky background or airmass and we therefore find no evidence that the observing conditions should produce systematic differences between the properties of targets selected in the SGC and NGC. Of greater concern is the fact that the two regions are tied together with relatively few scans to measure the relative photometric calibration (Padmanabhan et al. 2008). This suggests the possibility of a significant photometric offset between these two regions.

Figure 8. The distribution of DR9 spectroscopically-identified galaxies as a function of color combinations used in their selection, for objects in the Northern (black) and Southern (blue) Galactic Caps, and in the Southern Galactic Cap after applying the Schlafly & Finkbeiner (2010) offsets to the target selection (red). Dashed vertical lines display the location of the cut applied to BOSS targeting.

Schlafly et al. (2010) and Schlafly & Finkbeiner (2011) have found systematic variations in the colors of the population of SDSS stars as a function of their position. These offsets reflect a combination of variations in stellar populations across the Galaxy, calibration errors in the SDSS photometry (at the 1% level), and errors in the corrections for Galactic extinction. In particular, they find that there is a systematic offset in the measured photometry between the SGC and NGC (the amplitude of this offset is within the expected 1% rms of DR8 photometric calibration errors). The CMASS cut is sensitive to the $d_\perp$ color, both due to the hard cut (Eq.pl) and the sliding cut (Eq. 5). The LOWZ sample is sensitive to the $c_\parallel$ color, due to its sliding cut (Eq. 9). Schlafly & Finkbeiner (2011) find a 0.015 mag offset in $c_\parallel$ and a 0.0064 mag offset in $d_\perp$ between the NGC and SGC (based on their 'spectrum based' method; see their Table 6). Ross et al. (2011b) found that the 2% difference in the number density of CMASS targets between the NGC and SGC hemisphere was consistent with this offset in $d_\perp$. In what follows, we repeat and improve upon the analysis performed in Ross et al. (2011b) using only spectroscopically confirmed galaxies (albeit over a footprint 1/3 the size).




Figure 9. The galaxy spatial number density as a function of redshift, assuming a flat $\Lambda$CDM cosmology with $\Omega_m = 0.274$, of CMASS objects in the Northern (NGC) and Southern Galactic Caps (SGC). The red line displays the result when we apply the Schlafly & Finkbeiner (2010) offsets to the target selection in the SGC. The error bars are determined using 600 mock catalogs cut to the angular footprint of the SGC.



4.1 Number densities 

Fig. 8 displays the distribution of galaxies vs. the color/magnitude information used to select them. We show the relations for spectroscopically identified galaxies and apply the redshift failure and close pair weights described in Section 3.2 when determining number densities. The left panel shows the distribution of LOWZ galaxies against the value of the sliding cut, with the relationship for the NGC plotted in black and the relationship for the South plotted in red. At the cut (at 4.05 mag) the slope of the number density relationship is roughly 1.4x 10^ galaxies per steradian per magnitude change in $c_\parallel$. Thus, a 0.015 mag offset in $c_\parallel$ (as implied by Schlafly & Finkbeiner 2011) should cause a change of ~ 2x 10"' galaxies per steradian. If we apply this offset, we get the distribution in blue in the left-hand panel of Fig. 8. The curve appears much more consistent with the distribution in the NGC (black) than the fiducial SGC relationship (red).

In the DR9 CMASS sample, we find a 3.2% higher number density in the SGC than in the NGC ($2.832\times10^5\,$str$^{-1}$ compared to $2.744\times10^5\,$str$^{-1}$). This is 1.2% higher than found by Ross et al. (2011b) due to a combination of the different footprints and the fact that we use spectroscopically identified galaxies. The middle panel displays the distribution of CMASS galaxies against their value of $d_\perp$, and the right-hand panel against the sliding cut. For the $d_\perp > 0.55$ cut, we should expect $7\times10^5$ galaxies per steradian per mag change in $d_\perp$, and for the sliding cut we should expect an extra $5\times10^5$ change in the number of galaxies per steradian per mag change in $d_\perp$. Thus, we should expect a change of roughly $1.2\times10^6$ targeted galaxies per steradian per mag offset in $d_\perp$ (the two cuts are not strongly covariant). Therefore, given the 0.0064 mag offset in $d_\perp$ between the NGC and SGC, we should expect an excess of roughly 7700 galaxies per steradian in the SGC. This is 2.7% of the number density of the CMASS sample. When we apply this offset to the target selection, we find there only remains a 0.2% excess in the number density of objects in the SGC compared to the NGC.
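
The arithmetic behind the expected excess can be checked with a few lines of Python. The powers of ten used below ($10^5$ for the two slopes and the number densities) are an inference from the quoted 7700 galaxies per steradian and 2.7% figures, since the exponents are not legible in this copy; the variable names are illustrative only.

```python
# Slopes of the CMASS number density with respect to d_perp, in galaxies
# per steradian per mag. The powers of ten are inferred, not quoted.
SLOPE_HARD_CUT = 7.0e5      # from the d_perp > 0.55 cut
SLOPE_SLIDING_CUT = 5.0e5   # from the sliding cut
D_PERP_OFFSET = 0.0064      # mag, NGC/SGC offset in d_perp

N_NGC = 2.744e5             # galaxies per steradian (inferred exponent)
N_SGC = 2.832e5             # galaxies per steradian (inferred exponent)


def expected_excess(slopes, offset):
    """Sum the slopes (cuts treated as independent) and apply the offset."""
    return sum(slopes) * offset


excess = expected_excess([SLOPE_HARD_CUT, SLOPE_SLIDING_CUT], D_PERP_OFFSET)
excess_percent = 100.0 * excess / N_SGC
observed_percent = 100.0 * (N_SGC / N_NGC - 1.0)

print(f"expected excess: {excess:.0f} per steradian ({excess_percent:.1f}%)")
print(f"observed SGC excess: {observed_percent:.1f}%")
```

With these inputs the expected excess is 7680 galaxies per steradian, which rounds to the quoted 7700 and to 2.7% of the SGC density, against an observed difference of 3.2%.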

Fig. 9 displays the number density as a function of redshift, $n(z)$, for CMASS galaxies in the NGC (black) and SGC (blue), using bins of width $\Delta z = 0.01$. The number density is 10% smaller around the peak of the distribution in the South, and this becomes more dramatic when we apply the Schlafly & Finkbeiner (2011) offsets to the target selection. However, the number density is 20% larger at $z = 0.6$. The error bars are determined from the variations we find in the $n(z)$ of mocks cut to the same angular footprint as our Southern data set (the $n(z)$ for each individual mock varies due to the cosmic variance inherent in large-scale structure). Using these mocks, we can also determine the covariance between the $n(z)$ bins, thus allowing us to calculate the $\chi^2$ between the SGC and the NGC. When the offsets are applied, we find $\chi^2 = 36$ (for 27 bins). The $\chi^2$ is higher (39) when the offsets are not applied to the target selection. We find that 55 of the 600 Southern mock $n(z)$ have a $\chi^2$ that is greater than 36 (when compared to the average of the 600 mocks) and 34 have a $\chi^2$ that is greater than 39 (this is roughly in line with the probabilities of 11% and 6% one obtains for these $\chi^2$ values and 27 degrees of freedom). Thus, applying the offsets makes the redshift distributions of the NGC and SGC samples more consistent, but the differences between them remain slightly unusual.
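
The probabilities quoted for these $\chi^2$ values can be reproduced from the $\chi^2$ survival function. The sketch below is a stdlib-only illustration using a power series for the regularized incomplete gamma function; `chi2_sf` is a name introduced here, and in practice one would more likely call a library routine such as `scipy.stats.chi2.sf`.

```python
import math


def chi2_sf(x, dof):
    """P(chi^2 > x) for `dof` degrees of freedom.

    Uses the series for the regularized lower incomplete gamma function
    P(a, y), with a = dof / 2 and y = x / 2, and returns 1 - P.
    """
    a = 0.5 * dof
    y = 0.5 * x
    if y <= 0.0:
        return 1.0
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > 1e-16 * abs(total):
        n += 1
        term *= y / (a + n)
        total += term
    log_prefactor = a * math.log(y) - y - math.lgamma(a)
    return 1.0 - total * math.exp(log_prefactor)


p_with_offsets = chi2_sf(36.0, 27)      # chi^2 = 36, 27 bins
p_without_offsets = chi2_sf(39.0, 27)   # chi^2 = 39, 27 bins

# Fractions of the 600 mocks exceeding the same thresholds, from the text.
mock_fraction_36 = 55 / 600
mock_fraction_39 = 34 / 600

print(f"P(>36) = {p_with_offsets:.3f}, mocks: {mock_fraction_36:.3f}")
print(f"P(>39) = {p_without_offsets:.3f}, mocks: {mock_fraction_39:.3f}")
```

The analytic tail probabilities and the mock fractions (about 9% and 6%) agree to within the scatter expected from counting a few tens of mocks.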

In summary, for both the CMASS and LOWZ samples, we find that the difference in their number densities is consistent with the level of color offset between the NGC and SGC as determined by Schlafly & Finkbeiner (2011) using their spectrum method. Further, the offset is understood: it is within the expected rms of DR8 calibration errors and found between the two regions that have the least available data for relative calibration. We can apply these color offsets to the selection of galaxies in the South in an attempt to make a homogeneous sample, as doing so only makes the cuts more restrictive. However, there is some uncertainty inherent in the level of the offset, and, based on the mocks, the expected variance in the number density between the NGC and SGC is 2%. Further, the $n(z)$ distributions remain slightly inconsistent. We therefore believe the most conservative approach is to treat the two samples as having separate selection functions, due (at least in part) to the fact that there are offsets in the photometry between the two regions. Thus, we analyze all galaxies observed in the South separately, accepting they comprise a denser sample than those in the NGC.

4.2 Measurements of Clustering 



Fig. 10 displays the measured $\xi_\ell(s)$ in the Northern (NGC; red) and Southern Galactic Caps (SGC; blue). These measurements include the weights that correct for stars, fiber collisions, and redshift failures, defined by Eq. 25, and also the FKP weights. The area covered by the SGC data is only one quarter that of the NGC, and therefore the uncertainty in the SGC $\xi_\ell(s)$ is about twice as large as that of the NGC. At almost all scales, the $\xi_0$ measurements appear consistent with each other; the only notable exception is a significant dip in the measurements at $170\,h^{-1}\,$Mpc. Both of the $\xi_0(s)$ measurements display a prominent increase in clustering at around $100\,h^{-1}\,$Mpc, suggesting significant BAO peaks, though the peak does appear at a smaller scale in the Southern measurement. Interestingly, both the NGC and SGC measurements appear to also have a peak in $\xi_0(s)$ at around $s = 215\,h^{-1}\,$Mpc, but we note that the uncertainty on the measurements at these scales is much larger than around the BAO scale. The $\xi_2$ measurements appear slightly less consistent, especially between 75 and $95\,h^{-1}\,$Mpc, where the measurements are clearly inconsistent within the $1\sigma$ error-bars.

The black points in Fig. 10 display combined NGC and SGC measurements, which are produced by summing the DD, DR, and RR pairs, which is appropriate when FKP weights are used. The number of randoms in each region uses the same normalization with respect to the number density in each region (we use just over 15 times the number of galaxies). Thus, the relative normalization of randoms to galaxies between the two hemispheres is matched as if the samples were treated individually, and the results are optimally combined.

We test the consistency of the measurements by summing the covariance matrices of the NGC and SGC and determining $\chi^2$ in the standard fashion. For $s < 250\,h^{-1}\,$Mpc (35 data points), we find $\chi^2 = 45.4$ for $\xi_0$ and 32.0 for $\xi_2$, so despite the apparent differences, $\xi_2$ is actually more consistent than $\xi_0$. For $\xi_0$, the $\chi^2$ is slightly high: 11% of consistent samples drawn from a Gaussian distribution will have a higher $\chi^2$. Reducing the range of the fit to $25 < s < 150\,h^{-1}\,$Mpc (the primary range we study), the $\chi^2$ is 23.2 (18 data points), which shows consistency (18% of consistent samples will have a larger $\chi^2$). Sanchez et al. (2012) find the same $\chi^2$/dof (1.3) when they fit their $\xi(s)$ measurements, which use an alternative binning, between $40 < s < 200\,h^{-1}\,$Mpc. Scaling to the NGC sample, we find that the best-fit relative bias, $b_{\rm rel}$, of the SGC sample is $1.057 \pm 0.038$ ($\chi^2_{\rm min} = 20.9$), when fit to $25 < s < 150\,h^{-1}\,$Mpc. Relaxing the minimum bound to $s = 10\,h^{-1}\,$Mpc (adding two data points), we find $b_{\rm rel} = 0.983 \pm 0.015$ with $\chi^2_{\rm min} = 20.5$; $b_{\rm rel} = 1$ is just outside the $1\sigma$ bounds, suggesting that the clustering in the two regions is consistent to within $1.1\sigma$.

We further test the consistency of the NGC and SGC measurements by finding the best-fit bias for the NGC and SGC samples by scaling the mocks, as described in Section 3.3, in the range $25 < s < 150\,h^{-1}\,$Mpc. We find the best-fit $b = 1.904 \pm 0.039$, with $\chi^2_{\rm min} = 24.3$ (18 measurements), for the NGC data and $b = 2.06 \pm 0.07$, with $\chi^2_{\rm min} = 18.8$, for the SGC. The difference is nearly $2\sigma$. The best-fit bias of the combined sample, $b = 1.936 \pm 0.035$, is very close to the weighted average of the two samples ($b = 1.943 \pm 0.035$). Increasing the minimum scale to $s = 30\,h^{-1}\,$Mpc, we find a significant change in the best-fit bias for the NGC ($b = 1.87 \pm 0.05$), but we find negligible change for the SGC ($b = 2.08 \pm 0.11$). For the combined sample, the bias decrease is even larger, as $b = 1.886 \pm 0.048$ when we fit $30 < s < 150\,h^{-1}\,$Mpc. The values of $b$ (and other parameters we measure throughout this section) are summarized in Table 1 (found in Section 8).

© 2012 RAS, MNRAS 000,[T](28

12 A. J. Ross et al.

Figure 10. Top panels: The measured redshift-space correlation functions, $\xi_0$ and $\xi_2$, of CMASS data in the Northern (NGC; red) and Southern Galactic Caps (SGC; blue), and their pair-weighted average (black triangles), using FKP weights and the $w_{\rm star}$ weights. The error-bars are the standard deviations of the $\xi_\ell$ in the mocks drawn from the SGC footprint. For both the NGC and SGC measurements, the BAO feature is apparent at $s \sim 100\,h^{-1}\,$Mpc. Bottom panels: The difference between the measured $\xi_{0,2}$ of the NGC (red) and SGC (blue) CMASS samples and the mean of their respective mocks, after scaling the mocks for a best-fit bias. The error-bars are the standard deviations of the $\xi_\ell$ in the mocks drawn from the respective SGC and NGC footprints. [Horizontal axis: $s$ ($h^{-1}\,$Mpc); legend: Combined, NGC, SGC.]

Fixing the bias at the best-fit value from $\xi_0$, we find the best-fit value of $f$ from the $\xi_2$ measurements by scaling the mock $\xi_2$. For the NGC, we find $f = 0.691 \pm 0.052$ (with $\chi^2_{\rm min} = 10.9$) and for the SGC data $f = 0.79 \pm 0.09$ (with $\chi^2_{\rm min} = 11.8$). For the combined sample, $f = 0.711 \pm 0.044$ ($\chi^2_{\rm min} = 10.2$), very similar to the weighted average of $0.716 \pm 0.045$. This suggests that the information content in $\xi_2(s)$ measurements related to the velocity field is consistent between the two regions.
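
The "weighted average" quoted here is, we assume, the standard inverse-variance weighted mean of two independent measurements; the text does not spell out the formula. Under that assumption the quoted value can be reproduced from the NGC and SGC results (`weighted_mean` is an illustrative helper):

```python
import math


def weighted_mean(values, sigmas):
    """Inverse-variance weighted mean and its 1-sigma uncertainty."""
    weights = [1.0 / (s * s) for s in sigmas]
    wsum = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / wsum
    return mean, math.sqrt(1.0 / wsum)


# NGC and SGC measurements of f quoted in the text.
f_mean, f_err = weighted_mean([0.691, 0.79], [0.052, 0.09])
print(f"f = {f_mean:.3f} +/- {f_err:.3f}")
```

This returns $0.716 \pm 0.045$, matching the quoted weighted average. The NGC measurement dominates because its variance is roughly three times smaller.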

As described in Section 3.4, we can use the mocks to fit for a bias and stretch parameter, $\alpha$. We stress that these $\alpha$ values should reflect both changes we expect in the best-fit $\Omega_m h^2$ and distance constraints one may obtain from the BAO feature, i.e., they only reflect the level of disagreement we should expect in derived cosmological parameters using the NGC/SGC footprint, and whether the level of disagreement is consistent with what we expect to find given cosmic variance, but they contain no information on specific parameters. For the NGC footprint, we find $\chi^2_{\rm min} = 22.1$ at $\alpha = 0.990$, $b = 1.888$; marginalizing over the bias, $\alpha = 0.994 \pm 0.023$. For the SGC footprint, $\chi^2_{\rm min} = 12.3$ at $\alpha = 1.090$, $b = 2.319$; $\alpha = 1.083 \pm 0.029$ when we marginalize over bias. For the combined sample, we find $\chi^2_{\rm min} = 21.5$ at $\alpha = 1.019$, $b = 1.982$; marginalizing over the bias, $\alpha = 1.020 \pm 0.019$. This is smaller than the weighted average, $1.028 \pm 0.018$, of the two $\alpha$ measurements, but it is still greater than a $1\sigma$ shift from the result obtained using only the NGC data. This implies that one may find differences of $1\sigma$ on recovered cosmological parameters when comparing results from only the NGC data to the combined sample. Indeed, Sanchez et al. (2012) find this level of variation.

Figure 11. The projected number density of CMASS galaxies as a function of the potential systematics: stellar density ($n_{\rm star}$), Galactic extinction in the $r$-band ($A_r$), the $i$-band seeing (see$_i$), $i$-band background sky flux in nanomaggies/arcsec$^2$, denoted nmagg ('sky$_i$'; nmagg are related to the magnitude $m$ via $m = 22.5 - 2.5\log_{10}({\rm nmagg})$), and the airmass (air). The black lines display the results for the CMASS sample, and the green the results for the LOWZ sample. The red lines display the residual CMASS relationships after applying weights that account for the linear relationships between galaxy density, stellar density, and fiber magnitude ($w_{\rm star}$). The expected errors are determined by finding the standard deviation of the relationships measured from individual mock CMASS catalogs.

The difference between the NGC and SGC $\alpha$ values we measure is $2.5\sigma$. We find negligible changes in the values of $\alpha$ we obtain from the measured SGC $\xi_0$ if we apply the Schlafly & Finkbeiner (2011) offset to the selection of SGC CMASS galaxies, use any of the separate weighting schemes described in the appendix, or neglect to apply any weight at all; that is, we have not been able to identify any systematic that may cause the differences we observe. Any true difference would represent a violation of isotropy. Anderson et al. (2012) find that the tension between their BAO scale measurements is reduced to $1.4\sigma$ when reconstruction is applied to the CMASS galaxy density field (and is $2.5\sigma$ without reconstruction). As reconstruction generally improves the signal-to-noise in BAO scale estimation, this reduction in the tension between the two measurements implies the difference is indeed driven by noise.

In general, the level of disagreement between the NGC and SGC correlation functions is between 1 and $2\sigma$. The differences in the bias when scaling the mocks ($1.9\sigma$) and when we fit for a relative bias ($b_{\rm rel} = 1$ is $1.5\sigma$ from the best-fit) are both less than $2\sigma$. The $n(z)$ distributions disagree at a similar level of significance. The $n(z)$ discrepancy is likely related to the disagreement in the clustering. Finally, when fixing $\alpha = 1.017$ (the upper $1\sigma$ bound on $\alpha$ from the NGC sample), $\chi^2_{\rm min} = 16.4$ (meaning the $\chi^2$/dof is less than one) when testing the $\xi_0$ of the SGC sample (at $b = 2.11$). The SGC footprint is currently only 705 deg$^2$, 28% of its final (planned) size (2500 deg$^2$). If the differences are indeed in the noise, we expect that all results between the NGC and SGC will become more consistent as the BOSS survey continues and the sample grows.



5 ANGULAR VARIATIONS IN TARGET CATALOG 



Ross et al. (2011b) found significant correlations between the number density of galaxies in the SDSS imaging data with a photometric selection similar to that of the CMASS sample, and various parameters. In particular, the number density of observed galaxies decreased significantly as a function of the stellar density. We repeat the tests performed by Ross et al. (2011b), now using spectroscopically-confirmed galaxies. We are only using data within the DR9 mask (which is about 1/3 of the imaging area used in Ross et al. 2011b) and we now have access to mocks that allow us to quantify the statistical variations we should expect to find.



5.1 Galaxy density vs. potential systematics 

We determine the number density of the DR9 spectroscopically-identified galaxies as a function of stellar density, seeing, Galactic extinction, airmass, and sky background (all during the imaging observations). To perform these tests, we make HEALPix (Gorski et al. 2005) maps of DR9 galaxies and compare them to maps of the number of stars with $17.5 < i < 19.9$ or maps of the mean values of the potential observational systematic based on data from the SDSS DR8 Catalog Archive Server (CAS*) within pixels at $N_{\rm side} = 256$, which splits the sky into equal area pixels of 0.0525 deg$^2$. Rather than pixelate the mask, in each pixel we determine the number of galaxies, $n_{\rm gal}$, and the number of randoms, $n_{\rm ran}$ (multiplied by the factor $n_{\rm gal,tot}/n_{\rm ran,tot}$), and therefore map $n_{\rm gal}/n_{\rm ran}$.
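
The quoted pixel area follows from the HEALPix relation $N_{\rm pix} = 12N_{\rm side}^2$, and the per-pixel ratio map is simple to form once galaxies and randoms have been binned. The sketch below illustrates both steps in plain Python; the function names and the toy counts are hypothetical, and the assignment of objects to pixels (normally done with a HEALPix library) is taken as given.

```python
import math

FULL_SKY_DEG2 = 4.0 * math.pi * (180.0 / math.pi) ** 2  # about 41253 deg^2


def healpix_npix(nside):
    """Number of equal-area HEALPix pixels at resolution nside."""
    return 12 * nside * nside


def healpix_pixel_area_deg2(nside):
    """Area of one HEALPix pixel in square degrees."""
    return FULL_SKY_DEG2 / healpix_npix(nside)


def density_ratio_map(gal_counts, ran_counts):
    """Map n_gal / n_ran, with randoms rescaled by n_gal,tot / n_ran,tot.

    Both arguments are dicts from pixel index to (weighted) counts. Only
    pixels containing randoms, i.e. inside the mask, are returned.
    """
    norm = sum(gal_counts.values()) / sum(ran_counts.values())
    return {pix: gal_counts.get(pix, 0.0) / (n_ran * norm)
            for pix, n_ran in ran_counts.items() if n_ran > 0}


area = healpix_pixel_area_deg2(256)

# Hypothetical toy counts in three pixels with equal random coverage.
ratio = density_ratio_map({1: 2.0, 2: 4.0}, {1: 30.0, 2: 30.0, 3: 30.0})
```

At $N_{\rm side} = 256$ this gives 786432 pixels of about 0.0525 deg$^2$ each, as stated in the text. Normalizing by the randoms rather than by pixel area accounts for partially masked pixels without pixelating the mask itself.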

Fig. 11 displays the relationships between the number of galaxies observed and potential observational systematics. We weight each galaxy for redshift completeness and close pairs as described in Section 3.2 and we combine the NGC and SGC data by using the same normalization of randoms to galaxies in each respective region. We find no significant differences in our analysis if we analyze the two regions separately. We apply the same tests on the 600 mocks (described in Section 3.3) and use the standard deviation as the errors in each bin displayed in Fig. 11. The relationships for CMASS galaxies are displayed in black. As in Ross et al. (2011b), we find a 10% decrease in the number density of galaxies between areas with high and low stellar density.

* http://skyserver.sdss3.org/dr8/en/

As quantified in Section 4.1 of Ross et al. (2011b), 3% of the decrease in galaxy density results from the fact that, within 10" of stars, seeing reduces the ability to detect galaxies (with little dependence on the magnitude of the star between $17.5 < i_{\rm mod} < 19.9$). The relationship is not found in DR7 data (Bauer, A., private communication). The most significant change between the DR7 and DR8 photometric pipelines was a refinement in the sky background subtraction algorithm, in order to improve the photometry of bright galaxies (Aihara et al. 2011). One effect of this change is to increase the low-surface brightness extent of both galaxies and stars, causing more objects to be linked together. In regions of higher stellar density, this means that the deblender more often has to deal with complicated superpositions of many objects. To control processing time, the deblending code will separate out up to 25 overlapping objects in one parent, but no more. This happens more often with the new code, meaning that there are more missing galaxies in regions of high stellar density than before. This may explain the remaining 7% effect.

We also find an anti-correlation with Galactic extinction; this is at
least partly due to the fact that Galactic extinction and stellar density
are correlated. Indeed, we find that the correlation with extinction
becomes insignificant once weights are applied (see Section 5.2) to
correct for the relationship with stellar density. See Yahata et al.
(2007) for a more detailed study of the ways in which the Galactic
extinction, as determined by the Schlegel, Finkbeiner & Davis (1998) dust
maps, correlates with the observed density of galaxies.

We find a sharp decrease in the number density of galaxies in areas with
poor seeing; this effect was explained in Ross et al. (2011b) as being
due to the fact that the star/galaxy separation criteria defined by eqs.
7 and 8 remove more true galaxies in areas where the seeing is poor. This
systematic relationship had little effect on the measured clustering in
Ross et al. (2011b), as the pattern of seeing in the DR8 imaging is
essentially random on large scales. For sky background and airmass, the
level of fluctuations is close to what we should expect due to cosmic
variance (as represented by the error bars).

The relationships for the LOWZ sample are displayed in Fig. 11
with green lines. In contrast to the CMASS sample, we do not
find any systematic dependency with stellar density, Galactic ex- 
tinction, or seeing. This is likely due to the fact that LOWZ galaxies 
are, on average, considerably brighter than CMASS galaxies, and 
their detection should therefore be less affected by imaging system- 
atics. Given that the volume of the LOWZ sample is considerably 
smaller than the CMASS sample, we should expect larger cosmic 
variance. Indeed, it appears that all of the variance in number den- 
sity we find for the LOWZ sample can be attributed to cosmic vari- 
ance. 

In Ross et al. (2011b), the relationship between the number density of
galaxies and stellar density was found to depend on the surface
brightness of the galaxy. Given that the mean surface brightness of CMASS
galaxies is lower at higher redshift, the relationship between the galaxy
density and stellar density may depend on redshift. In Fig. 12 we show
the relationship between galaxy density and stellar density when
splitting the sample into three sets based on the $i_{\rm fib2}$
magnitude, as this magnitude uses a fixed aperture and is thus
essentially a surface brightness measurement. The slope of the
relationship clearly grows more negative at fainter $i_{\rm fib2}$. The
fact that the effect correlates strongly with surface brightness is
further evidence that it is related to a systematic in the DR8 imaging,
likely related to the sky subtraction routine, as opposed to a real
large-scale density fluctuation perfectly aligned with the Galaxy.

Figure 12. Same as the black points in the left panel of Fig. 11, except
that we have broken the CMASS sample into three subsamples based on the
labeled fiber magnitudes, $i_{\rm fib2}$: $i_{\rm fib2} < 20.75$ has
78,065 good redshifts, $20.75 < i_{\rm fib2} < 21$ has 85,284 good
redshifts, and $i_{\rm fib2} > 21$ has 112,304 good redshifts.

Figure 13. The cross-correlation with the full CMASS sample,
$\xi_{p,x}(r_{\rm eff})$, squared, divided by the auto-correlation,
$\xi_p$, for stars (orange), Galactic extinction (green), the sum of
seeing, sky background, and airmass (red), the sum of all five when
applying the linear-fit weights to all five potential systematics
($w_{\rm mcmc}$, blue, which we consider in the appendix), the sum of all
five when applying the linear-fit weights for only stellar density as a
function of $i_{\rm fib2}$ ($w_{\rm star}$, light blue), and the mean sum
of all five and its standard deviation on the mocks (black error-bars).
The dotted black line displays the expected statistical uncertainty,
determined from the variance of the mock $\xi_0(s)$ measurements.

Figure 14. The best-fit coefficients to the relationship
$n_{\rm gal}/n_{\rm ran} = A + B n_{\rm star}$ as a function of the
$i_{\rm fib2}$ magnitude of the galaxies. The blue points display the
results when we use FKP weights (see Eq. 26) and the red points show the
results when we do not use this weighting. The black lines display our
fit to these coefficients, which we use to determine weights as a
function of $i_{\rm fib2}$ and $n_{\rm star}$.

For each of the potential systematics displayed in Fig. 11, we determine
the auto-correlation, $\xi_p$, and cross-correlation, $\xi_{p,x}$, with
the CMASS sample as a function of the effective scale, $r_{\rm eff}$ (all
of which are defined by Eq. 23 and the surrounding text). As described in
Ho et al. (2012) and Ross et al. (2011b), the effect of any potential
systematic on the measured correlation function can be estimated as
$\xi_{p,x}(r_{\rm eff})^2/\xi_p(r_{\rm eff})$. Fig. 13 displays this
ratio for the five potential systematics displayed in Fig. 11. We confirm
with the spectroscopic sample the result found in the angular clustering
analysis (Ross et al. 2011b): the presence of stars has the greatest
systematic effect. The effect of Galactic extinction is second largest,
but Ross et al. (2011b) found it to be almost entirely degenerate with
the effect of stars. The sky background, airmass, and seeing all have
negligible effects. However, we have found more significant correlations
with sky background when the sample is split further by color. Overall,
we should expect a difference of $\sim 0.002$ between fiducial $\xi(s)$
measurements and those with corrections for systematics.

The mean and standard deviation, over the mock catalogs, of the sum of
$\xi_{p,x}(r_{\rm eff})^2/\xi_p(r_{\rm eff})$ for all five potential
systematics we consider are displayed with black error-bars in Fig. 13.
This sum is non-zero on average because the auto-correlation of each
systematic is positive and the cross-correlations, which have zero mean,
are squared. Thus, we should expect a non-zero mean for the sum of these
contributions, even when there is no real systematic effect.
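
The origin of the non-zero mean can be made explicit with a single expectation value. The sketch below assumes, for illustration only, that in the absence of a real systematic the cross-correlation at each scale scatters about zero with variance $\sigma_{p,x}^2$ (a symbol introduced here, not used elsewhere in the text), while $\xi_p$ is fixed by the systematic map:

```latex
\left\langle \xi_{p,x}^2 \right\rangle
  = \left\langle \xi_{p,x} \right\rangle^2 + \sigma_{p,x}^2
  = \sigma_{p,x}^2 > 0
\quad\Longrightarrow\quad
\left\langle \frac{\xi_{p,x}^2(r_{\rm eff})}{\xi_p(r_{\rm eff})} \right\rangle
  = \frac{\sigma_{p,x}^2(r_{\rm eff})}{\xi_p(r_{\rm eff})} > 0
\quad \text{for } \xi_p(r_{\rm eff}) > 0 .
```

Under this assumption the mock mean measures the noise floor of the statistic, so only an excess above it indicates a genuine systematic.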



5.2 Angular Weights 

As shown in Fig. 13, the primary source of systematic error is the
relationship with stellar density. To account for this systematic effect,
we apply weights that counteract the systematic relationship. Fig. 12
suggests that the systematic relationships depend on the surface
brightness of the galaxy. We thus use this information in order to
determine 'linear-fit stellar density weights', which we denote
$w_{\rm star}$. We split the sample by $i_{\rm fib2}$ and assume
$n_{\rm gal}/n_{\rm ran} = A + B n_{\rm star}$ for each sub-sample, now
using $N_{\rm side} = 128$ for the resolution of the maps. The result of
this approach is shown in Fig. 14. We find that for $i_{\rm fib2} <
20.45$, the relationship is consistent with being constant. At fainter
$i_{\rm fib2}$, we find that $A(i_{\rm fib2})$ and $B(i_{\rm fib2})$ vary
linearly with $i_{\rm fib2}$, and we thus use this linear fit to
determine the $w_{\rm star}$ weights (which ignores the rest of the
potential systematics). The linear fit is given by



$$A = A_0 + A_1\, i_{\rm fib2} \qquad (40)$$

$$B/{\rm deg}^2 = B_0 + B_1\, i_{\rm fib2} \qquad (41)$$



where $A_0 = 3.96$, $A_1 = -0.14$, $B_0 = -1.18 \times 10^{-3}$, and
$B_1 = 5.76 \times 10^{-5}$ (for the case where the FKP weights are
applied). For $i_{\rm fib2} < 20.45$, $A$ and $B$ are set to the
$A(20.45)$, $B(20.45)$ given by the above equations.
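
The general idea of a linear-fit weight can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure used for the catalog: it fits a single unweighted straight line to toy values of $n_{\rm gal}/n_{\rm ran}$ versus $n_{\rm star}$ and assumes the weight is the inverse of the fitted trend, which the text does not state explicitly. The $i_{\rm fib2}$ dependence is omitted, and the function names and numbers are invented for the example.

```python
def fit_line(x, y):
    """Unweighted least-squares fit of y = A + B * x; returns (A, B)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def inverse_trend_weight(n_star, intercept, slope):
    """Weight that flattens a fitted trend n_gal/n_ran = A + B * n_star.

    Taking the weight as the inverse of the fit is an assumption of this sketch.
    """
    return 1.0 / (intercept + slope * n_star)

# Toy data: galaxy density falling by 10% across the stellar-density range.
n_star = [1000.0, 2000.0, 3000.0, 4000.0, 5000.0]   # stars per deg^2
ratio = [1.05, 1.025, 1.0, 0.975, 0.95]             # n_gal / n_ran
A, B = fit_line(n_star, ratio)
corrected = [r * inverse_trend_weight(n, A, B) for n, r in zip(n_star, ratio)]
```

After weighting, the toy relationship is flat by construction; on real data the residual trend would be compared with the scatter of the mocks, as done in the text.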

The residual relationships after applying the $w_{\rm star}$ weights are
displayed in red in Fig. 11: the relationship with Galactic extinction
changes from having a slightly negative to a slightly positive slope, and
the seeing, sky background, and airmass relationships remain similar to
the unweighted case. For all but seeing, the relationships appear
consistent with the variations we expect due to cosmic variance (as shown
by the error-bars in Fig. 11). Further, the sum of the five potential
systematic contributions is consistent with the mean mock sum when the
$w_{\rm star}$ weights are applied to the CMASS data (as shown in Fig.
13). We therefore believe the $w_{\rm star}$ weights are appropriate to
apply to the CMASS sample. Additionally, we believe the $i_{\rm fib2}$
dependence of the $w_{\rm star}$ weights should (mostly) account for
changes in redshift. In the appendix, we compare the $w_{\rm star}$
weights to two other weighting schemes and find that the application of
the $w_{\rm star}$ weights has the least potential to remove true
fluctuations from the density field and does not bias the clustering
measurements of our mock galaxy samples.

Figure 15. The measured redshift space correlation functions of CMASS
galaxies using the fiducial catalog (red triangles), and applying weights
that correct for the linear relationships between galaxy density and
stellar density and $i_{\rm fib2}$ magnitude (blue circles). The bottom
panels display the difference between the measured $\xi_0$ and $\xi_2$
and the mean calculated from the mock $\xi_0$ and $\xi_2$. Black error
bars represent the standard deviation of the mock $\xi_0$ and $\xi_2$. We
analyze the apparent feature at $s \sim 200\,h^{-1}\,$Mpc in Section 7.



5.3 Effect of Angular Weights on CMASS Clustering 

The top panels of Fig. 15 display the resulting $\xi_0(s)$ and $\xi_2(s)$
measurements for CMASS galaxies for the fiducial sample (which includes
FKP weights and weights for close-pairs and redshift failures,
see Fig. Bl, and when we include the Wstar weights. As expected 



(Fig. 13), the weights for stellar density cause a nearly constant
decrease of 0.002 in the measured $\xi_0$. This change is greater than
the statistical uncertainty at scales greater than $110\,h^{-1}\,$Mpc.
The $w_{\rm star}$ weights have only a slight effect on the $\xi_2$
measurements. The difference is always smaller than the statistical
uncertainty, reaching $\sim 0.5\sigma$ at the largest scales.



Figure 16. The redshift distributions of CMASS objects in the Northern
Galactic Cap (NGC), splitting the area in half on seeing, i-band sky
background in nanomaggies per square arcsecond, the r-band Galactic
extinction ($A_r$), airmass, and the number density of stars
($n_{\rm star}$). The errors are determined by finding the standard
deviation of the mock $n(z)$. The upper-right panel displays the result
with (red) and without (blue) FKP weights, using solid lines for the NGC
sample and dashed lines for the SGC sample.

As described in Section 3.3, we can scale the mocks to determine a
best-fit bias, $b$. The consistency of the different weighting schemes
can be further tested by comparing the best-fit bias and associated
$\chi^2_{\rm min}$, which we determine in all cases using the covariance
matrices calculated using the mocks. Applying the $w_{\rm star}$ weights
to the CMASS sample when calculating $\xi(s)$ and fitting between $25 <
s < 150\,h^{-1}\,$Mpc, the best-fit $b = 1.936 \pm 0.035$, with
$\chi^2_{\rm min} = 22.7$ (18 measurements). When we do not apply the
$w_{\rm star}$ weights to the CMASS sample, the best-fit bias increases
by $\sim 0.5\sigma$ to $b = 1.949 \pm 0.035$, and the $\chi^2_{\rm min}$
increases to 34.2. Assuming Gaussian statistics, only 1.2% of consistent
samples would have a $\chi^2 > 34.2$, while 20.2% would have $\chi^2 >
22.7$. The values of $b$ (and other parameters we measure throughout this
section) are summarized in Table 1 (found in Section 8).
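
The quoted tail probabilities can be checked with a short Python sketch. It uses the closed-form survival function of the $\chi^2$ distribution for an even number of degrees of freedom and assumes that the number of degrees of freedom equals the number of measurements (18), which is the choice that reproduces the 1.2% and 20.2% figures. The function name `chi2_sf_even` is illustrative.

```python
import math

def chi2_sf_even(chi2, dof):
    """P(X > chi2) for a chi-square variable with an even number of dof.

    Closed form: exp(-x/2) * sum_{k=0}^{dof/2 - 1} (x/2)^k / k!
    """
    if dof <= 0 or dof % 2:
        raise ValueError("this closed form needs a positive, even dof")
    half = chi2 / 2.0
    term = 1.0    # k = 0 term of the series
    total = 1.0
    for k in range(1, dof // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

p_unweighted = chi2_sf_even(34.2, 18)   # without the stellar-density weights
p_weighted = chi2_sf_even(22.7, 18)     # with the stellar-density weights
```

The same function applies to the 14-measurement comparisons at $150 < s < 250\,h^{-1}\,$Mpc discussed later in the text.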

We find similar results when we fit the weighted and unweighted $\xi_0$
measurements for both $b$ and $\alpha$ (again in the range $25 < s <
150\,h^{-1}\,$Mpc). The $\chi^2_{\rm min} = 33.9$ at $\alpha = 1.009$, $b
= 1.971$ when the weights are not applied; marginalizing over the bias,
we find $\alpha = 1.007 \pm 0.019$. When applying the $w_{\rm star}$
weights, we find $\chi^2_{\rm min} = 21.5$ at $\alpha = 1.019$, $b =
1.982$; marginalizing over the bias, $\alpha = 1.020 \pm 0.019$. We note
that this test only reflects the level of systematic change we should
expect in derived cosmological parameters. The change in the $\alpha$
value therefore suggests that the application of the weights could cause
a shift in the best-fit cosmology of close to two-thirds $\sigma$ when
constraints are derived from the full shape of $\xi_0$. Anderson et al.
(2012) measure the same BAO peak position, to within 0.1%, whether or not
the $w_{\rm star}$ weights are applied, suggesting that the change in our
$\alpha$ measurement reflects the change in the shape of $\xi_0(s)$.

Fig. 15 suggests that the weights have little effect on $\xi_2$. This is
confirmed by fixing $b$ at the best-fit value from the $\xi_0$
measurements and finding the best-fit $f$ value by scaling the mocks.
When the $w_{\rm star}$ weights are applied, we find $f = 0.711 \pm
0.044$ with $\chi^2_{\rm min} = 11.8$ (18 data points); when the weights
are not applied, we find $f = 0.710 \pm 0.044$ with $\chi^2_{\rm min} =
12.7$.



6 RADIAL SELECTION FUNCTION 

We may be concerned that any parameter that causes a systematic effect in
the angular distribution of galaxies may also cause a change in the
redshift distribution. To test this possibility, we split the sample in
half, based on each of the same five potential systematics in turn, and
determine the redshift distributions. The results are shown in Fig. 16
when using the weights fit to the linear relationships between galaxy
density, stellar density and $i_{\rm fib2}$ ($w_{\rm star}$). None of the
distributions are significantly outside of the errors we determine based
on the standard deviation in the distributions of mock samples within the
NGC or SGC footprints (though we only plot the results for the NGC; see
Section 3.3).

We use FKP weights (Feldman et al. 1994), defined by Eq. 26, to optimally
weight the data as a function of redshift. This changes the $n(z)$ from
the blue curves to the red in the upper-right panel of Fig. 16, with
solid lines representing data from the NGC and dashed lines for the SGC.
Including the FKP weights effectively equalizes the contribution of every
redshift interval we consider in the $\xi_\ell$ calculation. This is
illustrated by the fact that the mean redshift changes from $z = 0.55$ to
$z = 0.57$. The FKP weights also make the NGC and SGC selection functions
more similar to one another.

We display $\xi_\ell$ measurements, with and without FKP weights, in Fig.
17. This serves as a check that these weights have not imparted any
systematic errors and illustrates the advantage of applying FKP weights.
For $\xi_0$, the amplitudes are marginally higher at all scales when the
FKP weights are applied. This result is due in part to the fact that the
FKP procedure assigns larger weights to the higher redshift data, which
are more likely to include more luminous galaxies that have higher bias.
We find, as expected, that the application of the FKP weights reduces the
uncertainty on derived parameters by at least 10% and that the derived
$f$ and $\alpha$ are consistent to within $0.5\sigma$ whether or not the
FKP weights are applied (without FKP weights, we find $\alpha = 1.033 \pm
0.025$ and $f = 0.711 \pm 0.044$).

Figure 17. Top panels: The measured redshift space auto-correlation
functions of the combined CMASS sample measured with (blue) and without
(red) FKP weights. Points represent measurements; error-bars are
determined from the variance of the mock calculations. Bottom panels:
Points display the difference between the measurements using and not
using the FKP weights. Error-bars represent the mean and the variance of
the difference using and not using FKP weights for the mock calculations.



6.1 Testing models of the radial selection function 

To model the expected galaxy distribution we must assign redshifts to the
random catalogs we use to calculate $\xi_\ell$. This is difficult to
achieve without using the data itself, as we would need a full
theoretical model for the galaxy population targeted. Without such a
model, we are limited by the fact that we can only estimate the true
$n(z)$ empirically. However, we can test the effects of this dependence
using the mocks. The mocks were constructed assuming a fiducial $n(z)$
(which is the $n(z)$ we measure from the data) and the $n(z)$ of each
individual mock will scatter around this input $n(z)$ due to cosmic
variance. For each mock, we consider three methods to determine the
$n(z)$ applied to the random catalog:

(I) 'spline', where a spline fit to the observed redshift distribution of
galaxies, using bins of width $\Delta z = 0.01$, is used to determine the
$n(z)$, and we sample from this to construct a random catalog;

(II) 'shuffled', where each point in the random sample is assigned the
redshift of a randomly selected galaxy from the galaxy sample; and

(III) 'true', where we use the input $n(z)$. The true $n(z)$ is, of
course, not available for any observed sample, and in this section we
test the difference between clustering measured using the true $n(z)$ and
either the spline or shuffled $n(z)$.

Figure 18. Top panel: Average offsets and $1\sigma$ deviations from the
true $\xi_0$ for different methods of assigning redshifts to random
catalogs, determined using the 600 mock galaxy catalogs. The solid black
line corresponds to the statistical errorbars. Bottom panel: As above,
for $\xi_2$.
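
The 'shuffled' method (II) is simple enough to sketch directly. The Python below assumes that the galaxy redshifts are available as a plain list and that drawing with replacement is acceptable; the function name `assign_shuffled_redshifts`, the seed argument and the toy redshifts are illustrative and not taken from the analysis pipeline.

```python
import random

def assign_shuffled_redshifts(galaxy_z, n_random, seed=None):
    """Give each random point the redshift of a randomly chosen galaxy.

    Redshifts are drawn with replacement, so the random catalog reproduces
    the observed n(z), including its fluctuations, without any smoothing.
    """
    rng = random.Random(seed)
    return rng.choices(galaxy_z, k=n_random)

# Toy example: six random points drawn from four galaxy redshifts.
galaxy_z = [0.45, 0.52, 0.57, 0.61]
random_z = assign_shuffled_redshifts(galaxy_z, 6, seed=42)
```

Because no binning or smoothing scale is chosen, this approach has no free parameter, in contrast to the node count of the spline method.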

We create a random catalog for each of the 600 individual mocks using
both the spline method and the shuffled method and compare the results to
those derived using the true underlying $n(z)$. For the spline method, we
use an $N$-node spline. We examine the cases where $N = 10, 20, 30, 50$.
We expect the results using the $N$-node spline to approach the limit of
the results of the shuffled catalog when $N$ is very large. Importantly,
we can compare all results to the true case, thereby quantifying the bias
and additional uncertainty imparted by the need to self-average as a
function of redshift, and thus addressing the concerns raised in Sylos
Labini et al. (2009).

The top panel of Fig. 18 shows the average bias of $\xi_0$ measurements
and its standard deviation, determined using 600 realizations of $\xi_0$
computed from mocks using different random catalogs. For the measurements
of the monopole, the average bias for all methods of constructing a
random catalog is a small fraction of the statistical errors and the
standard deviation of the bias is about a third of the statistical
errors. The bias is smallest when using the shuffled random catalog and
appears negligible for $\xi_0$.

Figure 19. Contours showing the 68%, 95%, and 99.7% confidence levels on
bias and growth factor estimated from $\xi_0$ and $\xi_2$ using different
methods to determine the radial distribution used for the random
catalogs.

The bottom panel of Fig. 18 shows the results of the same test on the
measurements of $\xi_2$. For the second Legendre moment of the
correlation function the systematic offset is larger than for $\xi_0$,
but is still small compared to the statistical errors. The standard
deviation of the bias is $\sim$50% of the statistical errors. For both
the monopole and quadrupole of the correlation function, the shuffled
catalog performs the best; however, the average bias for both $\xi_0$ and
$\xi_2$ is also small for the $N$-node spline methods.

The potential systematic error induced by the treatment of the random
catalog appears largest for $\xi_2$. To see the effect of $n(z)$
systematics, we find the best-fit values of bias $b$ and growth rate $f$
when performing a joint fit to $\xi_0$ and $\xi_2$ for each mock catalog,
first by using the random catalog with the true $n(z)$ and then repeating
the same analysis using $N$-node spline and shuffled random catalogs.
Figure 19 shows the 1, 2, and $3\sigma$ contours on the joint measurement
of $b$ and $f$ when using the true $n(z)$ (black), a shuffled $n(z)$, and
a 10- (green), 20- (blue), 30- (cyan) node spline. The $n(z)$ systematics
push the measured bias towards slightly higher values and the measured
growth rate towards lower values, but all contours are consistent within
$1\sigma$. The results derived with the shuffled catalog are on average
closer to the results derived with a true catalog than results obtained
using a spline fit. Similar fits are performed to the CMASS $\xi_\ell$ in
Reid et al. (2012).

The tests outlined in this section suggest that there is some systematic
uncertainty introduced by the method in which random points are assigned
redshifts. In general, this causes both an increase in the statistical
uncertainty and a systematic bias. The added statistical uncertainty is
at most 5% (as given by $\sqrt{1 + 0.33^2}$) of the fiducial uncertainty
for $\xi_0(s)$ and 12% of the fiducial uncertainty for $\xi_2(s)$. This
added statistical uncertainty is accounted for by measuring the mock
correlation functions used for the covariance matrix using the same
method as we employ on the data. Fig. 20 shows that the difference
between the CMASS $\xi_\ell$ measurements made using a 20-node spline and
a shuffled random catalog is indeed at the level we expect from the
mocks.
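
The 5% and 12% figures follow from adding the extra scatter in quadrature to the fiducial uncertainty. The short Python check below assumes that the relevant inputs are the standard deviations of the bias quoted earlier in this section, about a third of the statistical error for the monopole and about 50% for the quadrupole; the function name is illustrative.

```python
import math

def fractional_increase(extra_over_fiducial):
    """Fractional growth of an uncertainty when an independent contribution,
    expressed as a fraction of the fiducial error, is added in quadrature."""
    return math.sqrt(1.0 + extra_over_fiducial ** 2) - 1.0

monopole_increase = fractional_increase(0.33)    # about 0.05
quadrupole_increase = fractional_increase(0.50)  # about 0.12
```

Because the contributions add in quadrature, even a scatter as large as half the statistical error inflates the total uncertainty by only about one eighth.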

The systematic bias induced by the treatment of the randoms is negligible
for $\xi_0(s)$, but is larger for $\xi_2(s)$. In both cases, using a
shuffled random catalog, on average, produces the least biased result.
Therefore, we use the shuffled method to obtain redshifts for the random
catalogs we use in $\xi(s)$ calculations.

Figure 20. Top panel: The red line displays the difference between
$\xi_0$ measurements made using the Northern Galactic Cap (NGC) CMASS
sample when using a 20-node spline and when randomly selecting redshifts
from the galaxy sample ('shuffled') to assign redshifts to the random
catalogs. The error bars represent the mean and standard deviation of
this difference in the 600 mock galaxy catalogs occupying the NGC
footprint. Bottom panel: As above, for $\xi_2$.



6.2 Clustering Split at z = 0.52 

One may worry that the clustering at higher redshift may be more 
prone to systematic errors, given that, all else being equal, higher 
redshift objects should be fainter. Therefore, we split our CMASS
data into two samples, one with z < 0.52 and the other with 
z > 0.52. This split is close to the peak of the redshift distribution 
and represents the redshift at which the CMASS sample transitions 
from being approximately volume-limited to magnitude-limited (as 
can be inferred by inspecting the n{z) in, e.g., Fig.pl. We also find 
that the $w_{\rm star}$ weights become much more important at $z > 0.52$,
as the mean $i_{\rm fib2}$ magnitude is 21.01 above $z = 0.52$ and 20.74
below. Thus, we expect differences in $\xi(s)$ measurements to be
greatest when split at $z = 0.52$ (and indeed, differences in the
measured $\xi_\ell$ are smaller when we split at, e.g., $z = 0.55$).

Figure 21. Top panels: The measured redshift space auto-correlation
functions of the combined CMASS data split by redshift into samples with
$z > 0.52$ (red) and $z < 0.52$ (blue). Points represent measurements;
error-bars are determined from the mocks split at the same redshifts.
Bottom panels: The difference between the CMASS measurement and the mean
of the mock measurements scaled to the best-fit bias of the respective
CMASS samples. The values of $s$ for the $z < 0.52$ sample have been
shifted horizontally by $1\,h^{-1}\,$Mpc for clarity.

The resulting $\xi(s)$ are displayed in the top panels of Fig. 21 with
open blue circles representing $z < 0.52$ and red triangles representing
$z > 0.52$. The amplitudes of the $z > 0.52$ measurements are
significantly larger at all scales than the lower redshift ones. This may
partially be due to the fact that galaxies at $z > 0.52$ are more
luminous, and thus we may expect them to have a higher bias. We fit these
data to our mocks between $25 < s < 150\,h^{-1}\,$Mpc and account for the
6.4% change in the linear growth factor between $z = 0.61$ and 0.48 (the
mean redshifts of the respective samples). We indeed find a higher bias
for the $z > 0.52$ sample, as $b = 2.02 \pm 0.04$ for $z > 0.52$
($\chi^2_{\rm min} = 22.9$) and $b = 1.85 \pm 0.06$ for $z < 0.52$
($\chi^2_{\rm min} = 15.1$). The values of $b$ (and other parameters we
measure throughout this section) are summarized in Table 1 (found in
Section 8).

The bottom-left panel of Fig. 21 displays the difference between the
measurement and the mean of mocks (600 each for $z < 0.52$ and $z >
0.52$), after scaling the mocks to the best-fit bias. Even after this is
done, the amplitudes of the $z > 0.52$ $\xi_0$ measurements are larger
across all scales than the $z < 0.52$ counterparts. This illustrates the
level of covariance between $s$ bins in the $\xi_\ell$ measurements
(which allows best-fit solutions where most of the data is either above
or below the model). Both measurements are consistent with the mocks at
scales $150 < s < 250\,h^{-1}\,$Mpc (14 data points), as $\chi^2 = 7.6$
for $z < 0.52$ and 17.7 for $z > 0.52$ (91% of consistent samples will
have $\chi^2 > 7.6$ and 22% will have $\chi^2 > 17.7$).

The right-hand panels of Fig. 21 display the same content as the left,
but for $\xi_2$. The values of $|\xi_2|$ are consistently smaller for $z
< 0.52$. We test the significance of this result by fixing the bias at
the value determined from $\xi_0$ and scaling the mean of the $\xi_2$
mocks to find a best-fit $f$, via Eq. 37. For $z > 0.52$, $f = 0.75 \pm
0.05$ ($\chi^2 = 14.1$); for $z < 0.52$, $f = 0.59 \pm 0.08$ ($\chi^2 =
6.9$). Accounting for the fact that we expect a 6% decrease in $f$
between $z = 0.61$ and 0.48, this is a $1.3\sigma$ discrepancy and is not
surprising given we intentionally split the sample at a redshift where we
expected to find the largest differences.

Figure 22. The measured monopole of the correlation function, $\xi_0$,
multiplied by $s^2$ for the combined CMASS samples and three mock
catalogs with similar levels of clustering at $s \sim 200\,h^{-1}\,$Mpc.

We find that splitting the sample at $z = 0.52$ yields consistent
$\alpha$ values when marginalizing over the bias: for $z < 0.52$ we find
$\alpha = 1.016 \pm 0.038$ and for $z > 0.52$ we find $\alpha = 1.013 \pm
0.021$. Both best-fit $\alpha$ values are smaller than the best-fit for
the combined sample ($\alpha = 1.021 \pm 0.019$), implying that the
cross-correlation between the two slices contains significant information
relevant to $\alpha$. The consistency of the results further implies that
fits to standard $\Lambda$CDM cosmological parameters will yield
consistent results. We find similar levels of consistency when splitting
the individual NGC and SGC regions at $z = 0.52$ and performing the same
tests.



7 CLUSTERING AT THE LARGEST SCALES 

To this point we have focused on scales $s < 150\,h^{-1}\,$Mpc. In this
section, we focus on larger scales. Models of the galaxy correlation
function in a Universe dominated by dark energy and cold dark matter
cross zero at a scale just beyond the BAO peak and asymptote towards
zero. This is true even for models with a high level of primordial
non-Gaussianity (where the amplitudes around the BAO scale and
zero-crossing scale increase). Thus, $\xi_0(s)$ measurements that differ
from this behaviour indicate the presence of systematic effects in the
galaxy density field, effects not accounted for in the standard paradigm,
or correlated noise.

Figure 23. Left panel: The measured correlation function, $\xi(r_\perp,
r_\parallel)$, plotted as a function of the radial, $r_\parallel$, and
transverse, $r_\perp$, distance for the CMASS sample. To compare all
scales, we plot the hyperbolic sine of $75\,\xi[r_\perp, r_\parallel]$.
Middle panel: The mean $\sinh(75\,\xi[r_\perp, r_\parallel])$ of 600
mocks masked to simulate the Northern Galactic Cap (NGC) footprint of the
CMASS data set. Right panel: $(\xi_{\rm CMASS}[r_\perp, r_\parallel] -
\xi_{\rm mock}[r_\perp, r_\parallel])/2\sigma$, where $\sigma$ is the
standard deviation of the $\xi_{\rm mock}[r_\perp, r_\parallel]$ (the
sinh scaling is no longer used; we divide by $2\sigma$, rather than
$1\sigma$, for clarity).

Comparing our measured $\xi_0$ (using the $w_{\rm star}$ and FKP weights)
to the mean of that of the mocks between $150 < s < 250\,h^{-1}\,$Mpc (14
measurements), we find $\chi^2 = 27.3$. This value is rather large, as
only 2% of consistent samples have a greater $\chi^2$ value. We note that
if the $w_{\rm star}$ weights are not applied, $\chi^2 = 57.3$ (only $3
\times 10^{-5}$% of consistent samples would have such a large $\chi^2$).
If we do not use the FKP weights, but still use the FKP covariance matrix
(to make sure the measurement and not the covariance is driving the
$\chi^2$), the $\chi^2$ decreases to 23.2. This is still large enough
that only 5.7% of consistent samples would have a larger $\chi^2$. This
discrepancy is a product of the full sample, over all redshifts (and the
associated lower variance), as neither sample when split at $z = 0.52$
returned an abnormally large $\chi^2$ value. The best-fit bias of the
CMASS sample (when using FKP weights) is 2% higher than that of the mocks
(1.938 compared to 1.9) and, naively, $\chi^2 \propto b^4$. Performing
such a scaling reduces the $\chi^2$ to 25.2, which is still large enough
that we should expect a larger $\chi^2$ value for only 3.3% of consistent
samples.
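
The rescaled value can be reproduced with a one-line calculation. The expression below is one reading of the naive $b^4$ scaling, assuming the measured $\chi^2$ is simply multiplied by the fourth power of the ratio of the mock bias to the CMASS bias; the symbols $b_{\rm mock}$ and $b_{\rm CMASS}$ are labels introduced here for the two quoted values:

```latex
\chi^2_{\rm scaled}
  = \chi^2 \left( \frac{b_{\rm mock}}{b_{\rm CMASS}} \right)^{4}
  = 27.3 \times \left( \frac{1.9}{1.938} \right)^{4}
  \approx 27.3 \times 0.924
  \approx 25.2 .
```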

The effect of any systematic associated with the angular mask, e.g.,
variations in stellar density or errors in the normalization of pair
counts, is to add roughly a constant amplitude to $\xi_0$. Therefore, we
add a constant, $A$, to the $\xi_0$ of the mocks and find the best-fit
value, fitting between $150 < s < 250\,h^{-1}\,$Mpc. The $\chi^2$ is
minimized at $A = 0.0006$, but is only reduced to 25.8 (23.8 if scaling
by the bias). Given that we have reduced the number of degrees of freedom
to 13, the probabilities of being consistent remain the same (to within
the quoted number of significant digits). That is, $A$ (and, thus, any
remaining purely angular systematic) is not detected with any
significance.



7.1 Anisotropic Clustering and Feature at $200\,h^{-1}\,$Mpc

The inconsistency we find between the measured clustering and the mean of
mocks appears driven by an excess at $s \sim 200\,h^{-1}\,$Mpc. No matter
how we split the CMASS sample or change our analysis, the bump-like
feature around $200\,h^{-1}\,$Mpc remains. It is strange that this
feature is nearly as robust in the NGC and SGC samples alone and also in
both the samples split at $z = 0.52$. However, we can find mocks with
similar large-scale $\xi_0$ to the CMASS sample. The $\xi_0$ of three
such realizations are plotted along with the $\xi_0$ of the combined
CMASS sample in Fig. 22. Clearly, cosmic variance allows the possibility
of obtaining peaks in $\xi_0$ at around $200\,h^{-1}\,$Mpc that are
qualitatively similar to the one we observe.

Figure 24. The average value of $(\xi[r_\parallel, r_\perp] - \xi_{\rm
mock}[r_\parallel, r_\perp])/\sigma(r_\parallel, r_\perp)$ calculated in
rings with $\Delta r = 10\,h^{-1}\,$Mpc and $(r - \Delta r)^2 <
r_\parallel^2 + r_\perp^2 < (r + \Delta r)^2$.

We investigate further by examining the clustering of BOSS galaxies as a
function of radial, $r_\parallel$, and transverse, $r_\perp$, distances,
$\xi(r_\perp, r_\parallel)$. Given the redshift-space separation $s$ and
the cosine of the angle to the line-of-sight $\mu$, $r_\parallel = \mu s$
and $r_\perp = \sqrt{s^2 - r_\parallel^2}$. The left-hand panel of Fig.
23 displays the hyperbolic sine, sinh, of 75 times $\xi(r_\perp,
r_\parallel)$. This transformation allows information on all scales to be
displayed on a single figure. The effect of redshift space distortions is
apparent, as amplitudes along the line-of-sight are clearly decreased
relative to those at the same transverse separations.
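
The decomposition of a pair separation into radial and transverse components can be sketched as follows. This is a direct transcription of the two relations given above; the function name `split_separation` and the example values are illustrative.

```python
import math

def split_separation(s, mu):
    """Return (r_par, r_perp) for a redshift-space separation s and the
    cosine mu of the angle between the pair and the line of sight."""
    if not -1.0 <= mu <= 1.0:
        raise ValueError("mu is a cosine and must lie in [-1, 1]")
    r_par = mu * s
    # Guard against tiny negative values from floating-point rounding.
    r_perp = math.sqrt(max(s * s - r_par * r_par, 0.0))
    return r_par, r_perp

# Example: a pair separated by 100 (in h^-1 Mpc) with mu = 0.6.
r_par, r_perp = split_separation(100.0, 0.6)
```

A feature at fixed $s$ then traces a ring of constant $r_\parallel^2 + r_\perp^2$ in this plane, which is why the BAO signal appears as a ring.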
Figure 25. The measured spherically-averaged (in redshift-space) power spectrum, $P(k)$, of the full DR9 CMASS
sample with only the fiducial angular weight applied (open circles) and with the linear-fit weights for stellar
density, $w_{\rm star}$, applied (solid circles). The average of the mock $P(k)$ is displayed with a dashed line. The best-fit
smooth model $P(k)$ determined by Blanton et al. 2012 is plotted with a solid line. The inset panel displays the
same information, divided by the smooth-fit, with the solid line displaying the best-fit model including BAO.



There is a ring at around 100/i^^Mpc, as expected for the 
BAO feature. The extra information in the r±,r^^ dependence of 
the BAO feature is examined in Blanton et al. (in prep.). There also 
appears to be an excess in a ring around 200/i^^Mpc. The middle 
panel displays the mean sinh(755[rx, ^n]) in the mocks. At scales 
less than 100/i^^Mpc, this has a similar appearance to the mea- 
surements. The right-hand panel displays the difference between 
the measurements and the mocks, divided by twice the standard 
deviation of ^mocfc[''±,?'||]. In general, there is excess at the largest 
scales, and is most pronounced at scales ~ 200/i^^Mpc. The ex- 
cess at r ~ 200/i^^Mpc is largest for pairs between 10° and 50° 
from the line-of-sight. The fact that the excess appears at approxi- 
mately constant r, rather than r± or r 1 1 , suggests that the feature is 
not due to a systematic strictly related to the angular mask (as this 
would show up at constant r± ) or the process of obtaining spectro- 
scopic redshifts (as this would appear at constant r||). Finally, the 
feature shows up at nearly identical physical scales when the sam- 
ple is split at z — 0.52 (see Fig.|21[l, further implying the feature is 
not associated with a fixed angular scale. 
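The coordinate decomposition described above can be sketched as follows. This is only an illustrative example: the function names are hypothetical and are not taken from the analysis code used for the measurements.

```python
import math


def decompose_separation(s, mu):
    """Return (r_par, r_perp) for separation s and line-of-sight cosine mu."""
    r_par = mu * s
    # Guard against tiny negative values caused by floating-point round-off.
    r_perp = math.sqrt(max(s * s - r_par * r_par, 0.0))
    return r_par, r_perp


def angle_from_los_deg(mu):
    """Angle between the pair separation and the line of sight, in degrees."""
    return math.degrees(math.acos(mu))


# A pair at s = 200 (in h^-1 Mpc) with mu = 0.8 lies on the ring discussed in
# the text; its angle from the line of sight is acos(0.8).
r_par, r_perp = decompose_separation(200.0, 0.8)
```

By construction $r_\parallel^2 + r_\perp^2 = s^2$, so a feature at fixed $s$ traces a ring in the $(r_\perp, r_\parallel)$ plane.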

To further assess the significance of the feature at
$\sim 200\,h^{-1}$Mpc, we design a statistic that reflects the degree to
which there is an excess (or decrement) of signal at constant separation.
We determine the difference between the measured
correlation function and the mean of the mock correlation functions,
divided by the standard deviation determined from the mocks,
$\sigma(r_\parallel, r_\perp)$, and average it over the measurements within a constant
separation bin. Thus, we measure

$$ t(r, \Delta r) = \frac{\sum \Theta(r_\parallel, r_\perp)\,(\xi[r_\parallel, r_\perp] - \xi_{\rm mock}[r_\parallel, r_\perp])/\sigma(r_\parallel, r_\perp)}{\sum \Theta(r_\parallel, r_\perp)}, \qquad (42) $$

where $\Theta(r_\parallel, r_\perp) = 1$ if $(r - \Delta r)^2 < r_\parallel^2 + r_\perp^2 < (r + \Delta r)^2$ and
0 otherwise. In essence, this statistic is just a binned version of the
information plotted in the right panel of Fig. 23.

Fig. 24 displays the results of performing this test on the measurements,
setting $\Delta r = 10\,h^{-1}$Mpc. The most discrepant results
are at $r = 50\,h^{-1}$Mpc, although this is likely suggestive of a
preferred cosmology that differs from the one used by the mocks. At
$r = 204\,h^{-1}$Mpc, we find a peak with amplitude 0.72. If we
perform the same test on each of the 600 mocks, we find a peak
with amplitude greater than 0.72 at $r > 120\,h^{-1}$Mpc in 84 of the
mocks (14%), which suggests that the observed size of the peak is common
in the mocks. If we demand that the peak be at least as wide
as we find in the CMASS data by searching for features where
$t > 0.5$ over $25\,h^{-1}$Mpc and $t_{\rm max} > 0.72$ (as is the case for
the CMASS data), only 29 mocks (4.8%) are selected. This implies
that the combination of the size and width of the feature centered at
$200\,h^{-1}$Mpc is driving the large $\chi^2$ values of the $\xi_0$ measurements
with $s > 150\,h^{-1}$Mpc. The local maximum at $r = 100\,h^{-1}$Mpc, with
$t_{\rm max} = 0.45$ and $t > 0.3$ over $25\,h^{-1}$Mpc, is not significant, as
45% (267) of the mocks have a feature at least as large and wide
centred at $r > 80\,h^{-1}$Mpc.
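The ring statistic of Eq. (42) and the peak-width selection described above can be sketched as follows. This is a minimal illustration, not the code used for the analysis: the input layout (a list of tuples per grid cell), the function names, and the choice to restrict the wide-peak search to $r > 120\,h^{-1}$Mpc are assumptions made here.

```python
import math


def t_statistic(cells, r, dr):
    """Mean of (xi - xi_mock)/sigma over grid cells inside the ring around r.

    `cells` is an iterable of (r_par, r_perp, xi, xi_mock, sigma) tuples
    (a hypothetical layout chosen for this sketch).
    """
    total = 0.0
    count = 0
    for r_par, r_perp, xi, xi_mock, sigma in cells:
        r2 = r_par ** 2 + r_perp ** 2
        if (r - dr) ** 2 < r2 < (r + dr) ** 2:  # Theta = 1 inside the ring
            total += (xi - xi_mock) / sigma
            count += 1
    return total / count if count else math.nan


def has_wide_peak(r_values, t_values, t_floor=0.5, t_peak=0.72,
                  min_width=25.0, r_min=120.0):
    """True if t stays above t_floor over at least min_width in r (at
    r > r_min) and the maximum within that run exceeds t_peak."""
    run_start = None
    run_max = -math.inf
    for r, t in zip(r_values, t_values):
        if r > r_min and t > t_floor:
            if run_start is None:
                run_start, run_max = r, t
            run_max = max(run_max, t)
            if r - run_start >= min_width and run_max > t_peak:
                return True
        else:
            run_start = None  # the run is broken; start over
    return False
```

Applying `has_wide_peak` to the $t(r)$ curve of each mock and counting the number of `True` results gives the kind of mock fraction quoted in the text.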

Fig. 22 displays the $\xi_0$ of three realizations where $t_{\rm max} > 0.65$
and the peak is close to $200\,h^{-1}$Mpc, showing that these cases are
qualitatively similar to the CMASS $\xi_0$. The SDSS I/II LRG $\xi_0(s)$
measurements have larger than expected amplitudes at large scales.
However, a systematic study of the SDSS I/II LRG sample, similar
to our own, has not been published and Xu et al. (2012) still
suggest its large-scale clustering amplitudes are within $2\sigma$ of those
expected. The final BOSS data set will have three times more data
than the DR9 sample, and thus future data releases will confirm
if the feature at $200\,h^{-1}$Mpc is simply noise, a yet-to-be-determined
systematic, or a real feature in the clustering of galaxies. We do
not know of any model that predicts such a feature, as, e.g., the
predicted $\xi(s)$ given a strongly non-Gaussian primordial power
spectrum displays much smoother variation. This is studied further
in Ross et al. (in prep.).

7.2 Power Spectrum Measurements 

We investigate results using the spherically-averaged power spectrum,
$P(k)$, in order to isolate the large-scale density modes (i.e.,
low $k$; the $P(k)$ measurements are less covariant between $k$ bins)
and as a consistency check on results derived from $\xi_0$ measurements
(which should contain the same information). In Fig. 25
we display the measured $P(k)$, calculated as described in Section ISTT].
The open circles show the measurement without using
the weights accounting for stellar density, while the solid circles
display the measurement when these $w_{\rm star}$ weights are applied.
These weights only cause a significant difference for the smallest $k$
(largest scales), unlike the situation for $\xi_0(s)$, where the difference
was nearly independent of scale. The inset panel displays the same
information, divided by the best-fit smooth model found in
Anderson et al. (2012). In this panel, the open and solid circles are
indistinguishable from each other. Clearly, the $w_{\rm star}$ weights do not
affect the $P(k)$ measurements at the scales related to the BAO feature.

We scale the mock $P(k)$ to find best-fit $b$ values in the same
manner as performed throughout for $\xi(s)$. We determine the covariance
matrix of ln[P(k)k^] from the 600 mock catalogs and
minimize the $\chi^2$ of ln[P(k)k^]. We use the logarithmic scaling to
account for the fact that we expect the cosmic-variance contribution
to the $P(k)$ uncertainty to be proportional to P^(k) (see, e.g.,
Feldman et al. 1994). This scaling does not significantly alter the best-fit
values we determine, but it does result in significantly smaller $\chi^2_{\rm min}$
values. We find $b = 1.983 \pm 0.035$ fitting $k < 0.05\,h\,{\rm Mpc}^{-1}$ with
$\chi^2_{\rm min} = 19.2$ (11 degrees of freedom) when the $w_{\rm star}$ weights
are applied and $b = 2.001 \pm 0.035$ with $\chi^2_{\rm min} = 38.8$ when no
weights are applied. The $\chi^2_{\rm min}$ increases by a factor
greater than 2 without the weights. This shows how dramatic an effect
the weights have: only 0.006% of consistent samples would
have a $\chi^2 > 38.8$, while 5.8% would have $\chi^2 > 19.2$.
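The quoted percentages are the upper-tail probabilities of a $\chi^2$ distribution with 11 degrees of freedom. The following is a minimal standard-library sketch of that calculation; the function name `chi2_sf` is chosen here for illustration, and the routine uses the closed-form series for the regularized upper incomplete gamma function at integer and half-integer arguments.

```python
import math


def chi2_sf(x, dof):
    """Probability that a chi-square variable with `dof` degrees of
    freedom (a positive integer) exceeds x."""
    h = 0.5 * x
    if dof % 2 == 0:
        # Q(n, h) = exp(-h) * sum_{j<n} h^j / j!   with n = dof / 2
        term = math.exp(-h)
        total = 0.0
        for j in range(dof // 2):
            total += term
            term *= h / (j + 1)
        return total
    # Q(n + 1/2, h) = erfc(sqrt(h)) + exp(-h) * sum_{j<n} h^(j+1/2) / Gamma(j+3/2)
    total = math.erfc(math.sqrt(h))
    term = math.sqrt(h) * math.exp(-h) / math.gamma(1.5)
    for j in range((dof - 1) // 2):
        total += term
        term *= h / (j + 1.5)
    return total


p_weighted = chi2_sf(19.2, 11)    # fit with the stellar-density weights
p_unweighted = chi2_sf(38.8, 11)  # fit without the weights
```

Evaluated at the two $\chi^2_{\rm min}$ values above, this gives roughly 5.8% and 0.006%, matching the percentages in the text.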

The effect of the $w_{\rm star}$ weights on the $P(k)$ measurement
shows significant scale dependence, unlike for $\xi(s)$, where the
change was nearly constant. We can assume that any unaccounted-for
systematic will have the same $k$ dependence as the $w_{\rm star}$
weights and determine if adding a factor $A[P(k) - P_{\rm weight}(k)]$
improves our $b$ fit. We find $\chi^2_{\rm min} = 13.96$ at
$b = 1.974 \pm 0.035$ and $A = -0.41$. That is, a 41% stronger systematic
correction decreases the $\chi^2$ by 5.2. These results strongly suggest
that proper treatment of the weights is vital in any attempt
to obtain robust measurements that use $P(k)$ measurements at low
$k$, e.g., measurements of primordial non-Gaussianity or the scale
of matter-radiation equality (from the overall peak in $P(k)$). The
degeneracy between the systematic correction and the constraints
on primordial non-Gaussianity one can obtain is studied further in
Ross et al. (in prep.).

Figure 26. The $P(k)$ measurements for the Northern (NGC; open red triangles)
and Southern Galactic Cap samples (SGC; open blue squares), with
the mean of the respective mock samples displayed with a solid line. The
difference between the two lines illustrates the effect of the different windows
of the NGC and SGC on the expected $P(k)$.

The best-fit bias obtained from the $P(k)$ measurement is
nearly $1\sigma$ larger than what we obtain when $\xi(s)$ is fit at scales
$> 25\,h^{-1}$Mpc. Measurements of absolute bias values are notoriously
difficult, and the recovered value is often driven by the minimum
scale that is fit (since this measurement has the least uncertainty;
see, e.g., Ross et al. 2011a). Indeed, it is difficult even using
dark matter simulations, as Manera et al. (2010) found systematic
differences (at a level similar to what we find here) in large-scale
bias measurements when comparing results obtained from
matter-halo cross-power spectrum and halo auto-correlation measurements.
Thus, obtaining robust absolute bias measurements is
a generic problem rather than an issue specific to the BOSS DR9
galaxy sample.

Fig. 26 displays the $P(k)$ measured for the NGC (open triangles)
and SGC (open squares) samples and the average of the
mock $P(k)$ for each region separately. The windows of the respective
regions create a large difference in the expected $P(k)$,
as the shapes of the mock $P(k)$ are considerably different at small
$k$. Clearly, a direct comparison of the respective $P(k)$ is not appropriate.
Scaling the mock $P(k)$ for the respective regions, we
find $b = 1.991 \pm 0.039$ for the NGC with $\chi^2_{\rm min} = 18.5$, and
$b = 1.93 \pm 0.07$ for the SGC ($\chi^2_{\rm min} = 11.9$). The difference between
the respective $b$ is opposite to what we found from the $\xi_0(s)$
measurements, where the bias of the SGC sample was significantly
larger. This suggests that smaller-scale modes ($k > 0.05\,h\,{\rm Mpc}^{-1}$) have
significant influence on correlation function measurements of the
Southern sample at $s > 25\,h^{-1}$Mpc. Unlike for $\xi(s)$, the bias values
determined using $P(k)$ are consistent between the two regions.

Figure 27. Solid lines display the ratio of the CMASS $P(k)$ measurement
to the average of the mock $P(k)$. The vertical dotted lines denote
a spacing of $0.029\,h\,{\rm Mpc}^{-1}$ in $k$, with the first line at $k = 0.027\,h\,{\rm Mpc}^{-1}$.



Fig. 27 displays the ratio between the measured $P(k)$ and
the mean mock $P(k)$ when scaling $k$ linearly. The measured $\xi_0$
has a peak at $204\,h^{-1}$Mpc, which suggests we should find a periodic
feature in the $P(k)$ with wavelength $\sim 0.03\,h\,{\rm Mpc}^{-1}$ in $k$. We
have plotted dotted lines with a spacing of $0.029\,h\,{\rm Mpc}^{-1}$, with
the first at $k = 0.027\,h\,{\rm Mpc}^{-1}$. There appear to be local minima
at $k$ values near each dotted line. This appears most significant at
$k \sim 0.03\,h\,{\rm Mpc}^{-1}$. We caution that the possibility this feature is
due to an undiscovered systematic may need to be considered when
performing analysis of the shape of the CMASS power spectrum.
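The expected wavelength follows from the Fourier relation between $\xi(r)$ and $P(k)$: a narrow feature in $\xi$ at separation $r$ contributes a term proportional to $\sin(kr)/(kr)$ to $P(k)$, which oscillates with period $2\pi/r$ in $k$. A short sketch of the arithmetic (the function name is illustrative only):

```python
import math


def k_period(r_feature):
    """Period in k (h/Mpc) of the oscillation produced in P(k) by a narrow
    feature in the correlation function at separation r_feature (Mpc/h)."""
    return 2.0 * math.pi / r_feature


period = k_period(204.0)  # peak position of the measured xi_0, from the text
```

For $r = 204\,h^{-1}$Mpc this gives a period of about $0.031\,h\,{\rm Mpc}^{-1}$, close to the $0.029\,h\,{\rm Mpc}^{-1}$ spacing of the dotted lines in Fig. 27.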



8 CONCLUSIONS 

We have investigated potential systematic effects on the three- 
dimensional clustering of the DR9 sample of BOSS galaxies and 
tested the robustness of the results when we split the data into 
the regions we expect may be the most different for observational 
(Northern/Southern Galactic Caps) and physical (split at z = 0.52) 
reasons. Our main findings are summarized as follows: 

• Redshift failures occur at preferred locations on spectroscopic
tiles (see Fig. 3), but the nearest redshifts within the same sector
(which therefore share the same observing conditions) have an $n(z)$
like that of the overall sample (see Fig. 4). We therefore account for
redshift failures by up-weighting the nearest targeted object within
the same sector. We find this approach has a minor effect on the
measured $\xi(s)$ (see the red lines in Fig. 5).

• We account for target objects that lack spectra due to fiber collisions
by up-weighting the nearest targeted object. At large scales
($s > 10\,h^{-1}$Mpc), this should be equivalent to the more sophisticated
method proposed in Guo et al. (2011). This scheme increases
the $\xi(s)$ measurements in a manner consistent with a small increase
in the galaxy bias (see the blue lines in Fig. 5).

• The overall number densities of observed galaxies in the Southern
Galactic hemisphere are higher for both the LOWZ (8%) and
CMASS (3.2%) samples. If we apply the offsets in color found by
Schlafly & Finkbeiner (2011) to the selection of BOSS galaxies,
the number densities become consistent within 2% for LOWZ and
0.2% for CMASS. After applying the Schlafly & Finkbeiner (2011)
offsets, the $n(z)$ are discrepant at a level that we find in 10% of
mock samples; that is, the difference is not significant.

• The measured clustering in the NGC and SGC generally agrees
to within $2\sigma$, depending on the specific test that is performed. For
$\xi(s)$, the bias disagrees at $1.9\sigma$, but for $P(k)$, the discrepancy is
only $0.3\sigma$. Measurements of the amplitude of $\xi_2$, $f$, are consistent
to within $1\sigma$.

• The measured clustering in the NGC and SGC disagrees most at
around the BAO scale; we find this causes a difference in the stretch
parameter, $\alpha$ (which in this study encodes information on both the
BAO scale and the shape of $\xi(s)$), of $2.5\sigma$. We find no treatment
of the data (e.g., alternative weighting) that significantly alters the
level of discrepancy.

• We weight CMASS galaxies based on linear relationships between
the expected number density of galaxies as a function of
their $i_{\rm fib2}$ magnitude and the local stellar density ($w_{\rm star}$ weights).
We find no evidence that similar weights are necessary for (the
brighter) LOWZ galaxies. By applying the method used to determine
the weights to mocks (which have no need for correction)
we find that these weights produce no bias on the mean measured
$\xi(s)$, whereas more aggressive weighting schemes may (see the appendix).
Application of these weights reduces the measured
cross-correlations of the CMASS galaxies with stellar density, Galactic
extinction, seeing, sky background, and airmass to the level we expect
to find randomly (see Fig. 13). These weights account for redshift
dependence through the $i_{\rm fib2}$ relationship, but may no longer
be sufficient when the sample is split by color.

• Applying the $w_{\rm star}$ weights produces a $0.7\sigma$ shift in the measured
value of $\alpha$, most of which we believe is due to the change in shape
of the correlation function. Applying the weights reduces the $\chi^2$
from 34.2 to 22.7 when $\xi_0(s)$ measurements are fit in the range
$25 < s < 150\,h^{-1}$Mpc (18 data points). The $w_{\rm star}$ weights have
little effect on $\xi_2$ measurements, as the best-fit $f$ does not change
and the $\chi^2$ is reduced only from 12.7 to 11.8 when fitting scales
$25 < s < 150\,h^{-1}$Mpc.

• We use the mocks to determine the least-biased way to simulate
the radial selection function of CMASS galaxies, in order to produce
a random catalog. Randomly selecting the redshift of a galaxy
in the sample ('shuffled') produces a smaller bias than performing
a spline fit to the measured $n(z)$ ('spline'). In all cases, the bias is
negligibly small for $\xi_0$, but is as high as 50% of the statistical uncertainty
for $\xi_2$. We therefore advocate using shuffled random catalogs
and note that any constraints obtained using $\xi_2$ measurements
should take this bias into account. We find that the differences between
the $\xi_0$ and $\xi_2$ measurements we obtain for CMASS galaxies
using the spline and shuffled methods are consistent with the level
found in the mocks.

• The $\xi_0(s)$ measurements, when split at $z = 0.52$, yield bias and
$\alpha$ values that are consistent within $1\sigma$, when fit in the range
$25 < s < 150\,h^{-1}$Mpc. The $\xi_2$ measurements are somewhat discrepant,
as the best-fit $f$ values differ by $1.7\sigma$.

• The $\xi_0(s)$ measurements, at scales $s > 150\,h^{-1}$Mpc, are inconsistent
at a greater than 94% level with the expected clustering.
Allowing for a constant offset in large-scale clustering (as angular
systematics tend to contribute) produces no improvement.

• The inconsistency at large scales is dominated by a peak in $\xi_0(s)$
at $s \sim 200\,h^{-1}$Mpc. This feature appears in a ring when we measure
$\xi(r_\perp, r_\parallel)$, implying that it is not due to a systematic related solely
to either the mask or target catalog (as would show up in $r_\perp$) or
the process of obtaining spectroscopic redshifts (as would show up
in $r_\parallel$). Similar features that are at least as significant are found in
4.8% of the mocks that we test.

• The $w_{\rm star}$ weights have a dramatic effect on $P(k)$ measurements
and the effect depends strongly on $k$. The $\chi^2$ decreases from
38.8 to 19.2 when the $w_{\rm star}$ weights are applied to $P(k)$
measurements fit at $k < 0.05\,h\,{\rm Mpc}^{-1}$ (12 data points). We find the $\chi^2$
reduces to 14.0 when allowing a further correction of the form
$A(P[k]_{\rm no\,weight} - P[k]_{\rm weight})$. This implies that one must be careful
when obtaining any information that depends on measurements
of $P(k)$ at $k < 0.01\,h\,{\rm Mpc}^{-1}$, and this issue is studied further in Ross et al.
(in prep.).

Table 1. The parameters and $\chi^2_{\rm min}$ we derive from the clustering of BOSS DR9 CMASS galaxies, for different treatments and subsamples of the data: $b$ is
a measure of the amplitude of the measured clustering, $f$ is a measure of the amplitude of $\xi_2$, and $\alpha$ measures the preferred dilation in scale, relative to the
average of the mock $\xi_0$.

Estimator   Hemisphere   $z$ range          $w_{\rm star}$ weights?   $b$, $\chi^2$/dof                        $f$, $\chi^2$/dof         $\alpha$
$\xi(s)$    Both         0.43 < z < 0.7     yes                       1.936 ± 0.035, 22.7/17                   0.711 ± 0.044, 11.8/17    1.020 ± 0.019
$P(k)$      Both         0.43 < z < 0.7     yes                       1.983 ± 0.035, 19.2/11                   ...                       ...
$\xi(s)$    NGC          0.43 < z < 0.7     yes                       1.904 ± 0.039, 24.3/17                   0.691 ± 0.052, 10.9/17    0.994 ± 0.023
$P(k)$      NGC          0.43 < z < 0.7     yes                       1.991 ± 0.039, 18.5/11                   ...                       ...
$\xi(s)$    SGC          0.43 < z < 0.7     yes                       2.06 ± 0.07, 18.8/17                     0.79 ± 0.09, 11.8/17      1.083 ± 0.029
$P(k)$      SGC          0.43 < z < 0.7     yes                       1.93 ± 0.07, 11.9/11                     ...                       ...
$\xi(s)$    Both         0.43 < z < 0.52    yes                       1.85 ± 0.06, 15.1/17                     0.59 ± 0.08, 6.9/17       1.016 ± 0.038
$\xi(s)$    Both         0.52 < z < 0.7     yes                       2.02 ± 0.04, 22.9/17                     0.75 ± 0.05, 14.1/17      1.013 ± 0.021
$\xi(s)$    Both         0.43 < z < 0.7     no                        1.949 (uncertainty illegible), 34.2/17   0.710 ± 0.044, 12.7/17    1.007 ± 0.019
$P(k)$      Both         0.43 < z < 0.7     no                        2.001 ± 0.035, 38.8/11                   ...                       ...
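The 'shuffled' approach to the radial selection function summarized in the list above can be sketched in a few lines. This is a simplified illustration under stated assumptions: each random point is assigned the redshift of a galaxy drawn uniformly at random with replacement, galaxy weights are ignored, and the function name is hypothetical.

```python
import random


def shuffled_random_redshifts(galaxy_redshifts, n_randoms, seed=0):
    """Assign each random point a redshift drawn, with replacement, from the
    observed galaxy redshifts, so the randoms follow the measured n(z)."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    return [rng.choice(galaxy_redshifts) for _ in range(n_randoms)]


# Toy example: four galaxy redshifts inside the CMASS range 0.43 < z < 0.7.
galaxy_z = [0.45, 0.50, 0.55, 0.60]
random_z = shuffled_random_redshifts(galaxy_z, 1000, seed=1)
```

Because every random redshift is taken directly from the data, no smooth model of $n(z)$ (such as the spline fit) has to be assumed.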

The fundamental conclusions of this work are that, for BOSS
DR9 CMASS galaxies, we recommend the application of weights
to account for fiber collisions, redshift failures, and the systematic
effect of stars, and we find no further systematic dependencies. We
therefore expect Anderson et al. (2012), Sanchez et al. (2012), Reid
et al. (2012), and Samushia et al. (2012) (all of which apply the
same weights) to obtain robust cosmological constraints using the
clustering of BOSS DR9 galaxies.



Anderson et al. (2012) find the measured BAO position does
not depend on the application of the $w_{\rm star}$ weights: the position
changes by less than $0.1\sigma$. Therefore, our results do not suggest
there is a significant unaccounted-for systematic error in previous
BAO measurements. However, our study does suggest that any previous
finding of a large-scale excess in clustering measurements
may have been due to systematic errors, similar to the ones we correct
for, and requires extra scrutiny. Indeed, Ross et al. (2011b)
show that much of the excess in large-scale power presented in
Thomas et al. (2011) (who used SDSS-II imaging data) is removed
when the systematic effect of stars on the projected density field is
accounted for.

The SDSS-III BOSS DR9 sample of galaxies represents only
one third of the final BOSS CMASS sample and one quarter of the
final LOWZ sample. Further data sets will allow potential systematics
to be tested to an even greater extent and reveal whether the
feature at $200\,h^{-1}$Mpc is noise.
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APPENDIX A: ANGULAR WEIGHTING SCHEMES 

We considered three different weighting schemes in order to account
for the systematic correlations found in Section 5. These included:

(i) 'iterative weights', which we denote $w_{\rm it}$: This technique was
applied in Ross et al. (2011b). It assumes that the effects of each
systematic are separable, and proceeds by starting with one systematic
and setting the weight in every Healpix pixel equal to the
inverse of the quantity plotted in black in Fig. 11. One then moves
on to the next systematic and re-calculates the relationship between
the number density of galaxies and the systematic, and then multiplies
the weights by the inverse of the relationship. If the effects
are indeed separable, the $n_{\rm gal}({\rm sys})$ relationships should all remain
consistent with unity after all of the weights have been calculated.

To determine $w_{\rm it}$, we proceed in the order stellar density, Galactic
extinction, airmass, seeing, and sky background. If each is truly
separable, the order should not matter, and we do find negligible
differences for any permutation of the order we have tested. The
residual relationships between the galaxy number density and the
potential systematic, when weighting by the full $w_{\rm it}$, are displayed
with magenta lines in Fig. A1. In every case, the relationship is almost
fully removed. This implies the weighting is too aggressive,
as we should actually expect variations consistent with the size of
the error-bars in Fig. A1.

We can test the extent to which the $w_{\rm it}$ weights may remove true
power from clustering measurements by applying weights to each
mock sample (which of course contain no imaging systematics) following
the methods we apply to the data. The black triangles in
Fig. A2 display the average difference between the fiducial $\xi$ measurements
and those obtained when the full iterative weights, $w_{\rm it}$, are calculated
and applied to each mock. For the monopole, this decreases the
expected result by about half the statistical uncertainty (displayed
with the black dotted lines). There is also a non-zero bias for the $\xi_2$
measurements (top panel), but the difference is insignificant compared
to the statistical uncertainty.

Figure A1. Same as Fig. 11, except we now plot the residual relationships after applying iterative weights (magenta; $w_{\rm it}$), the residual relationships after using
a Monte-Carlo Markov Chain to simultaneously fit linear relationships in order to determine the weights (blue; $w_{\rm mcmc}$), and the residual when the weights
are split as a function of the fiber magnitude, but calculated only based on stellar density (red; $w_{\rm star}$).

Figure A2. The average difference between the fiducial redshift-space correlation
function of the mocks, $\xi$, and that using weights for each mock
using the full iterative method (black triangles, $w_{\rm it}$), and that using weights
for each mock using only a linear fit to the relationship with stellar density
(red circles, $w_{\rm star}$). Error-bars represent the standard deviation of the
difference. The black dotted lines display the variance on $\xi$ found in the
mocks. We note that the mocks do not require weights; a deviation from
zero implies that a bias is imparted by the weight scheme.
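The iterative procedure of scheme (i) can be sketched as below. This is a toy illustration rather than the implementation used here: it assumes equal-area pixels with a per-pixel galaxy density already normalized by the randoms, uses equal-width bins in each systematic, and all function names and the binning choice are hypothetical.

```python
def binned_ratio(density, weights, sys_map, n_bins):
    """Bin pixels by a systematic and return (bin index per pixel,
    weighted mean density per bin divided by the overall weighted mean)."""
    lo, hi = min(sys_map), max(sys_map)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for a constant map
    idx = [min(int((v - lo) / width), n_bins - 1) for v in sys_map]
    totals = [0.0] * n_bins
    counts = [0] * n_bins
    for i, d, w in zip(idx, density, weights):
        totals[i] += d * w
        counts[i] += 1
    mean_all = sum(d * w for d, w in zip(density, weights)) / len(density)
    ratio = [(totals[b] / counts[b]) / mean_all if counts[b] else 1.0
             for b in range(n_bins)]
    return idx, ratio


def iterative_weights(density, sys_maps, n_bins=5):
    """Multiply, one systematic at a time, by the inverse of the measured
    density-versus-systematic relationship (re-measured after each step)."""
    weights = [1.0] * len(density)
    for sys_map in sys_maps:
        idx, ratio = binned_ratio(density, weights, sys_map, n_bins)
        weights = [w / ratio[i] if ratio[i] > 0 else w
                   for w, i in zip(weights, idx)]
    return weights
```

Because the weights are set to the exact inverse of the measured relationship in every bin, the residual relationship is removed by construction, including any part of it that is a real fluctuation. This is the over-correction discussed in the text.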

(ii) 'linear-fit MCMC weights', which we denote $w_{\rm mcmc}$:
These weights are calculated by using a Monte-Carlo Markov chain
(MCMC) to simultaneously find the linear coefficients that best describe
the total $n_{\rm gal}({\rm sys})$ relationships.

The $w_{\rm mcmc}$ weights are determined by finding the best-fit solution to

$$ n_{\rm gal}/n_{\rm ran} = K + A\,n_{\rm star} + B\,A_r + C\,{\rm see}_i + D\,{\rm sky}_i + E\,{\rm air}, \qquad ({\rm A1}) $$

where $K, A, B, C, D, E$ are the coefficients we fit for, and see is
the seeing, sky is the sky background, and air is the airmass.
This is solved efficiently using an MCMC, as coefficients can be
applied to the Healpix map simultaneously (thereby accounting for
any covariances between the potential systematics). The value of
$w_{\rm mcmc}$ is then the inverse of the best-fit relationship. The residual
relationships after applying the $w_{\rm mcmc}$ weights are displayed
in blue in Fig. A1. These weights allow more variation than the
$w_{\rm it}$ weights. However, the sum of $\xi_{p,x}(r_{\rm eff})^2/\xi_p(r_{\rm eff})$ over all
five potential systematics for CMASS galaxies with the $w_{\rm mcmc}$
weights, displayed in blue in Fig. 13, is substantially smaller than
we expect from the mocks (black error bars in Fig. 13). This result
implies that the $w_{\rm mcmc}$ weights may also over-correct the
CMASS galaxy density field and remove true fluctuations.

(iii) 'linear-fit stellar density weights', which we denote
$w_{\rm star}$: These weights are calculated by performing linear fits to the
dependence of galaxy density on stellar density and $i_{\rm fib2}$ magnitude,
as described in Section 5, and are adopted for the analysis we
performed throughout this paper.

The effect of applying only weights for stars, fit to the linear
relationship between $n_{\rm gal}$ and $n_{\rm star}$, on the mocks is displayed in
red circles in Fig. A2. The difference is consistent with zero for
both $\xi_0$ and $\xi_2$. This suggests that the $w_{\rm star}$ weights do not over-smooth
the galaxy density field. Further, the sum of the five potential
systematic contributions is consistent with the mean mock sum
when the $w_{\rm star}$ weights are applied to the CMASS data (as shown in Fig. 13).
These results are in contrast to our previous tests that
suggested both the $w_{\rm it}$ and $w_{\rm mcmc}$ weights remove true power. Therefore,
we believe the $w_{\rm star}$ weights are the most appropriate to apply to
the CMASS sample.
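A linear-fit weight of the kind used in scheme (iii) can be sketched as follows for a single magnitude bin. The sketch makes several assumptions that are not taken from the analysis itself: it uses an unweighted least-squares fit, ignores the dependence on $i_{\rm fib2}$, and the function names are hypothetical.

```python
def fit_line(x, y):
    """Unweighted least-squares fit y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def stellar_density_weight(n_star, intercept, slope):
    """Weight for a galaxy: inverse of the fitted n_gal/n_ran at the local
    stellar density n_star."""
    return 1.0 / (intercept + slope * n_star)


# Toy relationship: galaxy density falling linearly with stellar density.
n_star_bins = [1.0, 2.0, 3.0, 4.0]
density_ratio = [1.05, 1.00, 0.95, 0.90]
intercept, slope = fit_line(n_star_bins, density_ratio)
```

Unlike the iterative scheme, only two parameters per fit are determined from the data, so bin-to-bin fluctuations about the linear trend are left in place.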

Finally, we considered the fact that Schlafly & Finkbeiner
(2011) determined color offsets for every SDSS imaging run available
at the time of their study. This included 95% of the CMASS
data in the NGC and 63% in the SGC. Restricting our analysis to
these regions, we can measure the relationship between the projected
number density of CMASS galaxies and the offset in $d_\perp$,
as given by Schlafly & Finkbeiner (2011). This is displayed in
Fig. A3, where we have applied the $w_{\rm star}$ weights to the CMASS
sample and the error bars reflect the variation found in mock samples
over the same footprint. The solid line displays the relationship
expected from scalings found in Section 4, determined to be
$n_{\rm gal}/n_{\rm ran} = 1 + 4.217\,\Delta d_\perp$, and appears consistent with what we
measure. We measured $\xi_0(s)$ in this region, applying a weight to
account for the predicted scaling with $\Delta d_\perp$, and found negligible
differences between the recovered measurements and those using
the full footprint, with separate NGC and SGC selection functions,
and the $w_{\rm star}$ weights (our recommended procedure). Within a single
hemisphere, the effect of the offsets is similar to that of seeing:
there is a systematic relationship, but the pattern of the imaging
runs is effectively random and therefore the relationship does not
impart spurious power at scales relevant to our analysis. Ho et al.
(2012) reached a similar conclusion when analyzing angular power
spectra of the BOSS imaging data. The exception is when one considers
the NGC and SGC together, as the mean offset between the
two regions is large enough to produce a significant offset in the
number densities of the two regions, and thus imparts a systematic
error in the large-scale clustering if separate NGC and SGC selection
functions are not applied.

Figure A3. The relationship between the projected number density of
CMASS galaxies and the offset $\Delta d_\perp$, as determined from the Schlafly &
Finkbeiner (2011) offsets determined for SDSS imaging runs. The error-bars
reflect the variation found in mock galaxy catalogs occupying the same area.
The solid line is the expected relationship, based on the scalings found in
Section 4.
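A weight that accounts for the predicted scaling with $\Delta d_\perp$ can be sketched as the inverse of the linear relationship quoted above. The slope is the value given in the text; the function name and the choice of a simple inverse are assumptions of this sketch.

```python
D_PERP_SLOPE = 4.217  # slope of n_gal/n_ran versus the d_perp offset (from the text)


def d_perp_weight(delta_d_perp):
    """Inverse of the expected density ratio 1 + 4.217 * delta_d_perp."""
    expected_ratio = 1.0 + D_PERP_SLOPE * delta_d_perp
    return 1.0 / expected_ratio


# An imaging run with a +0.01 offset is expected to be about 4% over-dense,
# so its galaxies are down-weighted accordingly.
weight = d_perp_weight(0.01)
```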
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